4,243 results for "U‐Net"
Search Results
2. Segmentation and visualization of the Shampula dragonfly eye glass bead CT images using a deep learning method.
- Author
-
Liao, Lingyu, Cheng, Qian, Zhang, Xueyan, Qu, Liang, Liu, Siran, Ma, Shining, Chen, Kunlong, Liu, Yue, Wang, Yongtian, and Song, Weitao
- Abstract
Micro-computed tomography (CT) of ancient Chinese glass dragonfly eye beads has enabled detailed exploration of their internal structures, contributing to our understanding of their manufacture. Segmentation of these CT images is essential but challenging due to variation in grayscale values and the presence of bubbles. This study introduces a U-Net-based model called EBV-SegNet, which enables efficient and accurate segmentation and visualization of these beads. We developed, trained, and tested the model using a dataset comprising four typical Shampula dragonfly eye beads, and the results demonstrated high-precision segmentation and precise delineation of the beads' complex structures. These segmented data were further analyzed using the Visualization Toolkit for advanced volume rendering and reconstruction. Our application of EBV-SegNet to Shampula beads suggests the likelihood of two distinct manufacturing techniques, underscoring the potential of the model for enhancing the analysis of cultural artifacts using three-dimensional visualization and deep learning. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
3. Multi-stage semi-supervised learning enhances white matter hyperintensity segmentation.
- Author
-
Duarte, Kauê T. N., Sidhu, Abhijot S., Barros, Murilo C., Gobbi, David G., McCreary, Cheryl R., Saad, Feryal, Camicioli, Richard, Smith, Eric E., Bento, Mariana P., and Frayne, Richard
- Subjects
MACHINE learning ,SUPERVISED learning ,ALZHEIMER'S disease ,CONVOLUTIONAL neural networks ,MILD cognitive impairment - Abstract
Introduction: White matter hyperintensities (WMHs) are frequently observed on magnetic resonance (MR) images in older adults, commonly appearing as areas of high signal intensity on fluid-attenuated inversion recovery (FLAIR) MR scans. Elevated WMH volumes are associated with a greater risk of dementia and stroke, even after accounting for vascular risk factors. Manual segmentation, while considered the ground truth, is both labor-intensive and time-consuming, limiting the generation of annotated WMH datasets. Un-annotated data are comparatively plentiful; however, the requirement for annotated data poses a challenge for developing supervised machine learning models. Methods: To address this challenge, we implemented a multi-stage semi-supervised learning (M3SL) approach that first uses un-annotated data segmented by traditional processing methods ("bronze" and "silver" quality data) and then uses a smaller number of "gold"-standard annotations for model refinement. The M3SL approach enabled fine-tuning of the model weights with the gold-standard annotations. This approach was integrated into the training of a U-Net model for WMH segmentation. We used data from three scanner vendors (more than five scanners) and from both cognitively normal (CN) adults and patient cohorts [with mild cognitive impairment (MCI) and Alzheimer's disease (AD)]. Results: An analysis of WMH segmentation performance across both scanner and clinical stage (CN, MCI, AD) factors was conducted. We compared our results to both conventional and transfer-learning deep learning methods and observed better generalization with M3SL across different datasets. We evaluated several metrics (F-measure, IoU, and Hausdorff distance) and found significant improvements with our method compared to conventional (p < 0.001) and transfer-learning (p < 0.001) methods. 
Discussion: These findings suggest that automated, non-machine-learning tools have a role in a multi-stage learning framework, reducing the impact of limited annotated data and thus enhancing model performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
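The staged schedule this abstract describes (pre-train on lower-quality "bronze" and "silver" labels, then fine-tune on a small "gold" set) can be sketched on a toy one-parameter model. The data, learning rate, and epoch counts below are illustrative assumptions, not the paper's configuration.

```python
def train_stage(w, data, lr):
    """One pass of gradient descent on (x, y) pairs for the model y = w*x."""
    for x, y in data:
        grad = 2.0 * (w * x - y) * x   # gradient of the squared error (w*x - y)^2
        w -= lr * grad
    return w

def m3sl_schedule(bronze, silver, gold, w0=0.0, lr=0.05, epochs=(3, 3, 10)):
    """Train on progressively higher-quality label tiers, ending with gold fine-tuning."""
    w = w0
    for stage_data, n in zip((bronze, silver, gold), epochs):
        for _ in range(n):
            w = train_stage(w, stage_data, lr)
    return w
```

The point of the schedule is that the noisy tiers move the weights into a reasonable region cheaply, and the small gold set only has to correct the residual bias.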
4. U-Net and Its Variants Based Automatic Tracking of Radial Artery in Ultrasonic Short-Axis Views: A Pilot Study.
- Author
-
Tian, Yuan, Gao, Ruiyang, Shi, Xinran, Lang, Jiaxin, Xue, Yang, Wang, Chunrong, Zhang, Yuelun, Shen, Le, Yu, Chunhua, and Zhou, Zhuhuang
- Abstract
Background/Objectives: Radial artery tracking (RAT) in the short-axis view is a pivotal step for ultrasound-guided radial artery catheterization (RAC), which is widely employed in various clinical settings. To eliminate disparities and lay the foundations for automated procedures, a pilot study was conducted to explore the feasibility of U-Net and its variants in automatic RAT. Methods: Approved by the institutional ethics committee, patients who were potential RAC candidates were enrolled, and their radial arteries were continuously scanned by B-mode ultrasonography. All acquired videos were processed into standardized images and randomly divided into training, validation, and test sets in an 8:1:1 ratio. Deep learning models, including U-Net and its variants, such as Attention U-Net, UNet++, Res-UNet, TransUNet, and UNeXt, were utilized for automatic RAT. The performance of the deep learning architectures was assessed using loss functions, the Dice similarity coefficient (DSC), and the Jaccard similarity coefficient (JSC). Performance differences were analyzed using the Kruskal–Wallis test. Results: The independent datasets comprised 7233 images extracted from 178 videos of 135 patients (53.3% women; mean age: 41.6 years). Consistent convergence of loss functions between the training and validation sets was achieved for all models except Attention U-Net. Res-UNet emerged as the optimal architecture in terms of DSC and JSC (93.14% and 87.93%), a significant improvement over U-Net (91.79% and 86.19%, p < 0.05) and Attention U-Net (91.20% and 85.02%, p < 0.05). Conclusions: This pilot study validates the feasibility of U-Net and its variants in automatic RAT, highlighting the predominant performance of Res-UNet among the evaluated architectures. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
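The two overlap metrics reported above, the Dice similarity coefficient (DSC) and Jaccard similarity coefficient (JSC), have standard definitions that a short numpy sketch makes concrete for binary masks:

```python
import numpy as np

def dice(pred, target, eps=1e-7):
    """DSC = 2|A∩B| / (|A| + |B|) for boolean masks pred and target."""
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def jaccard(pred, target, eps=1e-7):
    """JSC (IoU) = |A∩B| / |A∪B| for boolean masks pred and target."""
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return (inter + eps) / (union + eps)
```

The two are monotonically related (DSC = 2·JSC / (1 + JSC)), which is why papers such as this one report both but rank architectures identically under either.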
5. The Change Detection of Mangrove Forests Using Deep Learning with Medium-Resolution Satellite Imagery: A Case Study of Wunbaik Mangrove Forest in Myanmar.
- Author
-
Win, Kyaw Soe and Sasaki, Jun
- Abstract
This paper presents the development of a U-Net model using four basic optical bands and SRTM data to analyze changes in mangrove forests from 1990 to 2024, with an emphasis on the impact of restoration programs. The model, which employed supervised learning for binary classification by fusing multi-temporal Landsat 8 and Sentinel-2 imagery, achieved a high accuracy of 99.73% for the 2020 image classification. It was applied to predict long-term mangrove maps in Wunbaik Mangrove Forest (WMF) and to detect changes at five-year intervals. The change detection revealed significant changes in the mangrove forests, with 29.3% deforestation, 5.75% reforestation, and an annual rate of change of −224.52 ha/yr over 34 years. Large areas of mangrove forest have increased since 2010, primarily due to naturally recovered and artificially planted mangroves. Approximately 30% of the mangroves gained from 2015 to 2024 were attributed to mangrove plantations implemented by the government. This study contributes to developing a deep learning model with multi-temporal and multi-source imagery for long-term mangrove monitoring, providing accurate performance and valuable information for effective conservation strategies and restoration programs. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
6. Imbalanced segmentation for abnormal cotton fiber based on GAN and multiscale residual U-Net.
- Author
-
Yang, Shuo, Li, Jingbin, Li, Yang, Nie, Jing, Ercisli, Sezai, and Khan, Muhammad Attique
- Subjects
COTTON fibers ,FIBERS ,PIXELS ,SCARCITY - Abstract
The scale of white foreign fibers in bobbin yarn is small, resulting in multiple types of data imbalance in the dataset. These imbalances include a severe imbalance of foreign fiber pixels relative to background pixels and an imbalance in target size scales. Consequently, conventional semantic segmentation networks struggle to segment these fibers effectively. First, to tackle the scarcity of white foreign fiber instances within bobbin yarn samples, this research uses original foreign fiber images to train a DCGAN and generate adequate training samples. Second, a multiscale residual U-Net is constructed to extract foreign fiber features at different scales. The network is encouraged to learn semantic features at each scale and each layer of the decoding stage. This overcomes the problem of scale imbalance in the foreign fiber dataset and enhances the model's capability to extract weak semantic information from small targets. Third, a weighted binary cross-entropy loss function is integrated into the network's training phase to rectify the class imbalance and refine segmentation performance. This function adjusts the weighting of foreign fiber pixel data, thereby addressing the disproportionate distribution between foreign fiber and background pixels within the dataset. Finally, the proposed method is experimentally validated using a dataset of white foreign fibers. The experimental results show that the proposed method achieves better results on the critical evaluation metrics, as evidenced by an accuracy of 97.52%, an MIoU of 95.26%, a Dice coefficient of 81.29%, and an F1 score of 84.92%. These statistics demonstrate the method's efficacy in achieving high-precision segmentation of white foreign fibers. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
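A weighted binary cross-entropy of the kind this abstract describes up-weights the rare foreground (foreign-fiber) pixels so they are not drowned out by the background. The weight value below is an illustrative assumption, not the paper's setting:

```python
import numpy as np

def weighted_bce(pred, target, pos_weight=10.0, eps=1e-7):
    """pred: predicted foreground probabilities in (0, 1); target: {0, 1} mask.

    The positive (foreground) term is scaled by pos_weight to offset
    a background-dominated pixel distribution.
    """
    pred = np.clip(pred, eps, 1 - eps)   # avoid log(0)
    loss = -(pos_weight * target * np.log(pred)
             + (1 - target) * np.log(1 - pred))
    return loss.mean()
```

With `pos_weight=10`, a confidently missed fiber pixel costs ten times as much as an equally confident false alarm on background, which is the mechanism the abstract attributes its imbalance correction to.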
7. UR-Net: An Optimized U-Net for Color Painting Segmentation.
- Author
-
Liu, Zhen, Fan, Shuo, Liu, Silu, and Liu, Li
- Abstract
The pigments of cultural color paintings have faded with the passage of time. Color segmentation is essential for digital color reconstruction, but the complexity of color paintings makes it challenging to achieve high-precision segmentation using previous methods. To address the challenges of color painting segmentation, an optimized strategy based on U-Net is proposed in this paper. The residual blocks of a residual network (ResNet) are added to the original U-Net architecture, and a UR-Net is constructed for the semantic segmentation of color paintings. The following steps are taken. First, datasets of color paintings are obtained as training and test samples and are labeled with the following two pixel colors: earth red and ultramarine blue. Second, residual blocks are improved and added to fit the U-Net architecture. Then, a UR-Net is constructed and trained using the samples to obtain the semantic segmentation model. Finally, the effectiveness of the trained UR-Net model for segmenting the test samples is evaluated, and it is compared with the K-means clustering algorithm, ResNet, and U-Net. The results suggest that the segmentation accuracy of the UR-Net model is higher than that of the other methods for the color segmentation of painted images, and the IoUs of the segmented earth red and ultramarine blue pixels are 0.9346 and 0.9259, respectively, achieving the desired results. The proposed UR-Net model provides theoretical and methodological support for further in-depth research on color recognition and segmentation of cultural color paintings. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
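The residual-block idea UR-Net borrows from ResNet, an identity shortcut added around a stack of layers, reduces in its simplest form to y = relu(x + F(x)). The linear maps below stand in for the block's convolutions and are purely illustrative:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, W1, W2):
    """y = relu(x + W2·relu(W1·x)): an identity shortcut around two layers.

    W1, W2 are stand-ins for the block's convolution weights.
    """
    return relu(x + W2 @ relu(W1 @ x))
```

The shortcut means the block only has to learn a residual correction F(x); if the correction is useless (weights near zero), the block degrades gracefully to the identity, which is what makes deep stacks of such blocks trainable.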
8. Segmentation and classification of white blood SMEAR images using modified CNN architecture.
- Author
-
Kumar, Indrajeet and Rawat, Jyoti
- Abstract
The classification and recognition of leukocytes, or WBCs, in blood smear images plays a key role in the diagnosis of specific diseases, such as leukemia, tumors, and hematological disorders. A computerized framework for automated segmentation and classification of the WBC nucleus plays an important part in the recognition of WBC-related disorders. Therefore, this work emphasizes WBC nucleus segmentation using a modified U-Net architecture, and the segmented nuclei are further classified into their subcategories, i.e., basophil, eosinophil, neutrophil, monocyte, and lymphocyte. The classification and nucleus characterization task was performed using the VGGNet and MobileNet V2 architectures. Initially, collected instances are passed to the preprocessing phase for image rescaling and normalization. The rescaled and normalized instances are passed to the U-Net model for nucleus segmentation. Extracted nuclei are forwarded to the classification phase for class identification. Furthermore, the performance of the intended design is compared with other modern methods. This study yielded a model that successfully classifies the various nucleus morphologies (basophil, eosinophil, lymphocyte, monocyte, and neutrophil), achieving an overall test accuracy of 97.0% with the VGGNet classifier and 94.0% with the MobileNet V2 classifier. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
9. Image Denoising Using Deblur Generative Adversarial Network Denoising U-Net.
- Author
-
Usha Rani, B., Aruna, R., Velrajkumar, P., Amuthan, N., and Sivakumar, N.
- Subjects
- *
CONVOLUTIONAL neural networks , *GENERATIVE adversarial networks , *IMAGE denoising , *SIGNAL-to-noise ratio , *RANDOM noise theory - Abstract
Convolutional neural networks (CNNs) are becoming increasingly popular for image denoising. U-Nets, a type of CNN architecture, have been shown to be effective for this task. However, the influence of shallow layers on deeper layers decreases as the depth of the network increases. To address this issue, the authors propose a new image denoising method called DGANDU-Net. DGANDU-Net combines the DeblurGAN design with a U-Net architecture. This combination allows DGANDU-Net to effectively remove noise from images while preserving fine details. The authors also propose the use of two loss functions, mean square error (MSE) and perceptual loss, to improve the performance of DGANDU-Net. MSE is used to learn and improve the extracted features, while perceptual loss is used to produce the final denoised image. The authors evaluate the performance of DGANDU-Net across a variety of noise levels and find that it outperforms other state-of-the-art denoising algorithms in terms of both visual quality and two evaluation indices, peak signal-to-noise ratio (PSNR) and the Structural Similarity Index Measure (SSIM). Specifically, for extremely noisy environments with a noise standard deviation of 75, DGANDU-Net achieves an average PSNR of 37.39 dB on the test dataset. The authors conclude that DGANDU-Net is a promising new method for image denoising that has the potential to significantly improve the quality of medical images used for diagnosis and treatment. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
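The PSNR figure quoted above follows the standard definition PSNR = 10·log10(MAX²/MSE). A minimal sketch, assuming 8-bit images scaled to [0, 255]:

```python
import numpy as np

def psnr(clean, denoised, max_val=255.0):
    """Peak signal-to-noise ratio in dB between a reference and a test image."""
    mse = np.mean((clean.astype(np.float64) - denoised.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")          # identical images: unbounded PSNR
    return 10.0 * np.log10(max_val ** 2 / mse)
```

Because the scale is logarithmic, the reported 37.39 dB corresponds to a very small residual mean-squared error relative to the 255 dynamic range.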
10. Enhanced WGAN Model for Diagnosing Laryngeal Carcinoma.
- Author
-
Kim, Sungjin, Chang, Yongjun, An, Sungjun, Kim, Deokseok, Cho, Jaegu, Oh, Kyungho, Baek, Seungkuk, and Choi, Bo K.
- Subjects
- *
GENERATIVE artificial intelligence , *PREDICTIVE tests , *PREDICTION models , *COMPUTER-assisted image analysis (Medicine) , *RESEARCH funding , *EARLY detection of cancer , *DIAGNOSTIC errors , *LARYNGOSCOPY , *ARTIFICIAL neural networks , *COMPUTER-aided diagnosis , *MACHINE learning ,LARYNGEAL tumors - Abstract
Simple Summary: This study aimed to enhance the accuracy of detecting laryngeal carcinoma using a modified AI model based on U-Net. The model was designed to automatically identify lesions in endoscopic images. Researchers addressed issues such as mode collapse and gradient explosion to ensure stable performance, achieving 99% accuracy in detecting malignancies. The study found that malignant tumors were detected more reliably than benign ones. This technology could help reduce human error in diagnoses, allowing for earlier detection and treatment. Furthermore, it has the potential to be applied in other medical fields, benefiting overall healthcare.

This study modifies the U-Net architecture for pixel-based segmentation to automatically classify lesions in laryngeal endoscopic images. The advanced U-Net incorporates five-level encoders and decoders, with an autoencoder layer to derive latent vectors representing the image characteristics. To enhance performance, a WGAN was implemented to address common issues such as mode collapse and gradient explosion found in traditional GANs. The dataset consisted of 8171 images labeled with polygons in seven colors. Evaluation metrics, including the F1 score and intersection over union, revealed that benign tumors were detected with lower accuracy compared to other lesions, while cancers achieved notably high accuracy. The model demonstrated an overall accuracy rate of 99%. This enhanced U-Net model shows strong potential in improving cancer detection, reducing diagnostic errors, and enhancing early diagnosis in medical applications. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
11. U-Net Semantic Segmentation-Based Calorific Value Estimation of Straw Multifuels for Combined Heat and Power Generation Processes.
- Author
-
Li, Lianming, Wang, Zhiwei, and He, Defeng
- Subjects
- *
TRANSFORMER models , *INDUSTRIALISM , *IMAGE segmentation , *STRAW , *UNITS of time - Abstract
This paper proposes a system for real-time estimation of the calorific value of mixed straw fuels based on an improved U-Net semantic segmentation model. The system addresses the uncertainty in heat and power generation per unit time in combined heat and power generation (CHPG) systems caused by fluctuations in the calorific value of straw fuels. It integrates an industrial camera, a moisture detector, and quality sensors to capture images of the multi-fuel straw, and applies the improved U-Net network for semantic segmentation of the images, accurately calculating the proportion of each type of straw. The improved U-Net introduces a self-attention mechanism into the skip connections of the final encoder layer, replaces traditional convolutions with depthwise separable convolutions, and replaces the traditional convolutional bottleneck layers with a Transformer encoder. These changes ensure that the model achieves high segmentation accuracy and strong generalization capability while maintaining good real-time performance. The semantic segmentation results are used to calculate the proportions of the different straw types and, combined with moisture content and quality data, the calorific value of the mixed fuel is estimated in real time from the elemental composition of each straw type. Validation using images captured from an actual thermal power plant shows that, under the same conditions, the proposed model has only a 0.2% decrease in accuracy compared to the traditional U-Net segmentation network, while the number of parameters is reduced by 74% and inference speed is improved by 23%. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
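The ~74% parameter reduction is consistent with what swapping standard convolutions for depthwise separable ones typically buys: a k×k standard convolution carries k·k·Cin·Cout weights, while the separable version carries k·k·Cin (depthwise) plus Cin·Cout (pointwise). A back-of-the-envelope sketch, with illustrative channel counts:

```python
def conv_params(k, c_in, c_out):
    """Weight count of a standard k×k convolution (biases ignored)."""
    return k * k * c_in * c_out

def separable_params(k, c_in, c_out):
    """Weight count of a depthwise (k×k per channel) + pointwise (1×1) pair."""
    return k * k * c_in + c_in * c_out
```

For a typical 3×3 layer with 64 input and 128 output channels, the separable form needs roughly an eighth of the weights; the whole-model figure depends on which layers are swapped, which is why the paper reports 74% rather than this per-layer ratio.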
12. Attention-Enhanced Urban Fugitive Dust Source Segmentation in High-Resolution Remote Sensing Images.
- Author
-
He, Xiaoqing, Wang, Zhibao, Bai, Lu, Fan, Meng, Chen, Yuanlin, and Chen, Liangfu
- Subjects
- *
PARTICULATE matter , *DUST control , *REMOTE sensing , *FEATURE extraction , *IMAGE segmentation , *DUST , *FUGITIVE emissions - Abstract
Fugitive dust is an important source of total suspended particulate matter in urban ambient air. Existing segmentation methods for dust sources struggle to distinguish key features from secondary ones and segment poorly at image edges. To address these issues, this paper proposes the Dust Source U-Net (DSU-Net), which enhances the U-Net model by incorporating VGG16 for feature extraction and integrating the shuffle attention module into the skip-connection branch to enhance feature acquisition. Furthermore, we combine Dice Loss, Focal Loss, and Active Boundary Loss to improve boundary extraction accuracy and reduce loss oscillation. To evaluate the effectiveness of our model, we selected Jingmen City, Jingzhou City, and Yichang City in Hubei Province as the experimental area and established two dust source datasets from 0.5 m high-resolution remote sensing imagery acquired by the Jilin-1 satellite: dataset HDSD-A for dust source segmentation and dataset HDSD-B for distinguishing dust control measures. Comparative analyses of our proposed model against other typical segmentation models demonstrated that DSU-Net has the best detection performance, achieving a mIoU of 93% on dataset HDSD-A and 92% on dataset HDSD-B. In addition, we verified that it can be successfully applied to detect dust sources in urban areas. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
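Two of the three loss terms DSU-Net combines, Dice Loss and Focal Loss, can be sketched in numpy; the focusing parameter and blend weights below are illustrative assumptions, and the boundary-loss term is omitted:

```python
import numpy as np

def dice_loss(p, t, eps=1e-7):
    """1 - Dice overlap between predicted probabilities p and binary targets t."""
    inter = (p * t).sum()
    return 1.0 - (2 * inter + eps) / (p.sum() + t.sum() + eps)

def focal_loss(p, t, gamma=2.0, eps=1e-7):
    """Cross-entropy down-weighted on easy examples by the (1-pt)^gamma factor."""
    p = np.clip(p, eps, 1 - eps)
    pt = np.where(t == 1, p, 1 - p)          # probability assigned to the true class
    return (-((1 - pt) ** gamma) * np.log(pt)).mean()

def combined_loss(p, t, w_dice=0.5, w_focal=0.5):
    """Illustrative blend of the two terms; the paper's weights are not stated here."""
    return w_dice * dice_loss(p, t) + w_focal * focal_loss(p, t)
```

Dice Loss targets region overlap while Focal Loss concentrates gradient on hard pixels, which is why combinations like this are popular for thin or sparse targets such as dust-source boundaries.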
13. Multi-scale input layers and dense decoder aggregation network for COVID-19 lesion segmentation from CT scans.
- Author
-
Lan, Xiaoke and Jin, Wenbing
- Subjects
- *
COMPUTED tomography , *DEEP learning , *COVID-19 , *DIAGNOSTIC imaging , *STATISTICAL correlation - Abstract
Accurate segmentation of COVID-19 lesions from medical images is essential for achieving precise diagnosis and developing effective treatment strategies. Unfortunately, this task presents significant challenges, owing to the complex and diverse characteristics of opaque areas, subtle differences between infected and healthy tissue, and the presence of noise in CT images. To address these difficulties, this paper designs a new deep-learning architecture (named MD-Net) based on multi-scale input layers and a dense decoder aggregation network for COVID-19 lesion segmentation. In our framework, the U-shaped structure serves as the cornerstone, facilitating the complex hierarchical representations essential for accurate segmentation. By introducing the multi-scale input layers (MIL), the network can effectively analyze both fine-grained details and contextual information in the original image. Furthermore, we introduce an SE-Conv module in the encoder network, which enhances the ability to identify relevant information while suppressing the transmission of extraneous or non-lesion information. Additionally, we design a dense decoder aggregation (DDA) module to integrate feature distributions and important COVID-19 lesion information from adjacent encoder layers. Finally, we conducted a comprehensive quantitative analysis and comparison on two publicly available datasets, namely Vid-QU-EX and QaTa-COV19-v2, to assess the robustness and versatility of MD-Net in segmenting COVID-19 lesions. The experimental results show that the proposed MD-Net outperforms its competitors, exhibiting higher scores on the Dice value, Matthews correlation coefficient (MCC), and Jaccard index. In addition, we conducted ablation studies on the Vid-QU-EX dataset to evaluate the contributions of each key component of the proposed architecture. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
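A squeeze-and-excitation style module of the kind the SE-Conv block builds on recalibrates channels in three steps: global-average-pool each channel, pass the channel statistics through a small bottleneck, and rescale the channels by the resulting sigmoid gates. The weight shapes below are illustrative assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_block(x, W1, W2):
    """x: (C, H, W) feature map; W1: (C//r, C) and W2: (C, C//r) bottleneck weights."""
    z = x.mean(axis=(1, 2))                      # squeeze: per-channel statistics
    s = sigmoid(W2 @ np.maximum(W1 @ z, 0.0))    # excitation: channel gates in (0, 1)
    return x * s[:, None, None]                  # rescale each channel by its gate
```

The gating is what lets the encoder amplify lesion-relevant channels and suppress the "extraneous or non-lesion information" the abstract refers to.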
14. Automated High-Precision Recognition of Solar Filaments Based on an Improved U2-Net.
- Author
-
Jiang, Wendong and Li, Zhengyang
- Subjects
- *
SOLAR magnetic fields , *SOLAR active regions , *SOLAR activity , *SOLAR flares , *DEEP learning , *CORONAL mass ejections ,SOLAR filaments - Abstract
Solar filaments are a significant solar activity phenomenon, typically observed in full-disk solar observations in the H-alpha band. They are closely associated with the magnetic fields of solar active regions, solar flare eruptions, and coronal mass ejections. With the increasing volume of observational data, automated high-precision recognition of solar filaments using deep learning is crucial. In this study, we processed full-disk H-alpha solar images captured by the Chinese H-alpha Solar Explorer in 2023 to generate labels for solar filaments. The preprocessing steps included limb-darkening removal, grayscale transformation, K-means clustering, particle erosion, multiple closing operations, and hole filling. A dataset containing solar filament labels was constructed for deep learning. We developed the Attention U2-Net neural network by introducing an attention mechanism into U2-Net. Attention U2-Net achieved an average Accuracy of 0.9987, an average Precision of 0.8221, an average Recall of 0.8469, an average IoU of 0.7139, and an average F1-score of 0.8323 on the solar filament test set, showing significant improvements over other U-Net variants. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
15. Precision Segmentation of Subretinal Fluids in OCT Using Multiscale Attention-Based U-Net Architecture.
- Author
-
Karn, Prakash Kumar and Abdulla, Waleed H.
- Subjects
- *
MACULAR degeneration , *OPTICAL coherence tomography , *MACULAR edema , *RETINAL diseases , *COMPUTER-assisted image analysis (Medicine) - Abstract
This paper presents a deep-learning architecture for segmenting retinal fluids in patients with Diabetic Macular Oedema (DME) and Age-related Macular Degeneration (AMD). Accurate segmentation of multiple fluid types is critical for diagnosis and treatment planning, but existing techniques often struggle with precision. We propose an encoder–decoder network inspired by U-Net, processing enhanced OCT images and their edge maps. The encoder incorporates Residual and Inception modules with an autoencoder-based multiscale attention mechanism to extract detailed features. Our method shows superior performance across several datasets. On the RETOUCH dataset, the network achieved F1 Scores of 0.82 for intraretinal fluid (IRF), 0.93 for subretinal fluid (SRF), and 0.94 for pigment epithelial detachment (PED). The model also performed well on the OPTIMA and DUKE datasets, demonstrating high precision, recall, and F1 Scores. This architecture significantly enhances segmentation accuracy and edge precision, offering a valuable tool for diagnosing and managing retinal diseases. Its integration of dual-input processing, multiscale attention, and advanced encoder modules highlights its potential to improve clinical outcomes and advance retinal disease treatment. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
16. Comparative Approach to De-Noising TEMPEST Video Frames.
- Author
-
Vizitiu, Alexandru Mădălin, Sandu, Marius Alexandru, Dobrescu, Lidia, Focșa, Adrian, and Molder, Cristian Constantin
- Subjects
- *
CONVOLUTIONAL neural networks , *OPTICAL character recognition , *SCIENTIFIC community , *COMPARATIVE method , *ADAPTIVE filters - Abstract
Analysis of unintended compromising emissions from Video Display Units (VDUs) is an important topic in research communities. This paper examines the feasibility of recovering the information displayed on the monitor from reconstructed video frames. The study holds particular significance for our understanding of the security vulnerabilities associated with the electromagnetic radiation of digital displays. Given the amount of noise in reconstructed TEMPEST video frames, the work in this paper focuses on two different approaches to de-noising images for efficient optical character recognition: first, an Adaptive Wiener Filter (AWF) with adaptive window size implemented in the spatial domain, and then a Convolutional Neural Network (CNN) with an encoder–decoder structure that follows both the classical auto-encoder architecture and the U-Net architecture (an auto-encoder with skip connections). These two techniques improved the Structural Similarity Index Metric (SSIM) by more than a factor of two for the AWF and by up to a factor of four for the Deep Learning (DL) approach. In addition, to validate the results, the possibility of text recovery from the processed noisy frames was studied using a state-of-the-art Tesseract Optical Character Recognition (OCR) engine. The present work aims to draw attention to the security importance of this topic and the non-negligible character of VDU information leakage. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
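The spatial-domain branch above relies on an adaptive Wiener filter. The underlying idea, pull each pixel toward its local mean in proportion to how much of the local variance looks like noise, can be sketched in pure numpy; the window size and global noise-variance estimate are illustrative assumptions, and the paper's adaptive window sizing is not reproduced:

```python
import numpy as np

def adaptive_wiener(img, win=3, noise_var=None):
    """Local-statistics Wiener de-noising of a 2-D grayscale image."""
    pad = win // 2
    x = np.pad(img.astype(np.float64), pad, mode="reflect")
    H, W = img.shape
    mean = np.empty((H, W))
    var = np.empty((H, W))
    for i in range(H):                       # local mean/variance per pixel
        for j in range(W):
            patch = x[i:i + win, j:j + win]
            mean[i, j] = patch.mean()
            var[i, j] = patch.var()
    if noise_var is None:
        noise_var = var.mean()               # crude global noise estimate
    # Gain near 0 where the patch is all noise (heavy smoothing),
    # near 1 where signal variance dominates (detail preserved).
    gain = np.maximum(var - noise_var, 0.0) / np.maximum(var, 1e-12)
    return mean + gain * (img - mean)
```

This signal-dependent gain is what distinguishes the adaptive filter from plain mean filtering: flat regions are smoothed aggressively while edges, where local variance is high, pass through largely untouched.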
17. An Automated Clubbed Fingers Detection System Based on YOLOv8 and U-Net: A Tool for Early Prediction of Lung and Cardiovascular Diseases.
- Author
-
Hsu, Wen-Shin, Liu, Guan-Tsen, Chen, Su-Juan, Wei, Si-Yu, and Wang, Wei-Hsun
- Subjects
- *
PROCESS capability , *LUNG diseases , *IMAGE segmentation , *DEEP learning , *CLOUD computing , *CARDIOVASCULAR diseases - Abstract
Background/Objectives: Lung and cardiovascular diseases are leading causes of mortality worldwide, yet early detection remains challenging due to the subtle symptoms. Digital clubbing, characterized by the bulbous enlargement of the fingertips, serves as an early indicator of these diseases. This study aims to develop an automated system for detecting digital clubbing using deep-learning models for real-time monitoring and early intervention. Methods: The proposed system utilizes the YOLOv8 model for object detection and U-Net for image segmentation, integrated with the ESP32-CAM development board to capture and analyze finger images. The severity of digital clubbing is determined using a custom algorithm based on the Lovibond angle theory, categorizing the condition into normal, mild, moderate, and severe. The system was evaluated using 1768 images and achieved cloud-based and real-time processing capabilities. Results: The system demonstrated high accuracy (98.34%) in real-time detection with precision (98.22%), sensitivity (99.48%), and specificity (98.22%). Cloud-based processing achieved slightly lower but robust results, with an accuracy of 96.38%. The average processing time was 0.15 s per image, showcasing its real-time potential. Conclusions: This automated system provides a scalable and cost-effective solution for the early detection of digital clubbing, enabling timely intervention for lung and cardiovascular diseases. Its high accuracy and real-time capabilities make it suitable for both clinical and home-based health monitoring. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
18. A "Region-Specific Model Adaptation (RSMA)"-Based Training Data Method for Large-Scale Land Cover Mapping.
- Author
-
Li, Congcong, Xian, George, and Jin, Suming
- Subjects
- *
MACHINE learning , *LAND cover , *HABITAT conservation , *BIOGEOCHEMICAL cycles , *DATABASES , *DEEP learning - Abstract
An accurate and historical land cover monitoring dataset for Alaska could provide fundamental information for a range of studies in this distinctive region, such as habitat conservation, biogeochemical cycles, and climate systems. This research addresses challenges associated with the extraction of training data for timely and accurate land cover classification in Alaska over longer time periods (e.g., greater than 10 years). Specifically, we designed the "Region-Specific Model Adaptation (RSMA)" method for generating training data. The method integrates land cover information from the National Land Cover Database (NLCD), LANDFIRE's Existing Vegetation Type (EVT), and the National Wetlands Inventory (NWI) with machine learning techniques to generate robust training samples based on the Anderson Level II classification legend. The method assumes that spectral signatures vary across regions because of diverse land surface compositions but that, despite these variations, there are consistent, collective land cover characteristics spanning the entire region. Building on this assumption, this research utilized the classification power of deep learning algorithms and the generalization ability of RSMA to construct the model for the method. Additionally, we interpreted existing vegetation plot information for land cover labels as validation data to reduce inconsistency in human interpretation. Our validation results indicate that the RSMA method improved the quality of the training data derived solely from the NLCD by approximately 30% in overall accuracy. The validation assessment also demonstrates that the RSMA method can generate reliable training data at large scales in regions that lack sufficient reliable data. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
19. A Transformer-Unet Generative Adversarial Network for the Super-Resolution Reconstruction of DEMs.
- Author
-
Zheng, Xin, Xu, Zhaoqi, Yin, Qian, Bao, Zelun, Chen, Zhirui, and Wang, Sizhu
- Subjects
- *
GENERATIVE adversarial networks , *DIGITAL elevation models , *ENVIRONMENTAL sciences , *GEOLOGY , *AGRICULTURE - Abstract
A new model called the Transformer-Unet Generative Adversarial Network (TUGAN) is proposed for super-resolution reconstruction of digital elevation models (DEMs). Digital elevation models are used in many fields, including environmental science, geology, and agriculture. The proposed model uses a self-similarity Transformer (SSTrans) as the generator and U-Net as the discriminator. SSTrans, a model that we previously proposed, yields good reconstruction results in structurally complex areas but has little advantage where the surface is simple and smooth, because it adds too much extra detail to the data. To resolve this issue, we propose the novel TUGAN model, in which the U-Net discriminator's multilayer skip connections enable it to consider both global and local information when making judgments. The experiments show that TUGAN achieves state-of-the-art results for all types of terrain details. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
20. Spreading anomaly semantic segmentation and 3D reconstruction of binder jet additive manufacturing powder bed images.
- Author
-
Gourley, Alexander, Kaufman, Jonathan, Aman, Bashu, Schwalbach, Edwin, Beuth, Jack, Rueschhoff, Lisa, and Reeja-Jayan, B.
- Subjects
- *
CONVOLUTIONAL neural networks , *CERAMIC powders , *MANUFACTURING processes , *IMAGE analysis , *RAPID tooling - Abstract
Variability in the inherently dynamic nature of additive manufacturing introduces imperfections that hinder the commercialization of new materials. Binder jetting produces ceramic and metallic parts, but low green densities and spreading anomalies reduce the predictability and processability of resulting geometries. In situ feedback presents a method for robust evaluation of spreading anomalies, reducing the number of builds required to refine processing parameters in a multivariate space. In this study, we report layer-wise powder bed semantic segmentation for the first time with a visually light ceramic powder, alumina (Al2O3), leveraging an image analysis software package to rapidly segment optical images acquired during the additive manufacturing process. Using preexisting image analysis tools allowed for rapid analysis of 316 stainless steel and alumina powders with small data sets by providing an accessible framework for implementing neural networks. Models were trained on five build layers for each material to classify base powder, parts, streaking, short spreading, and bumps from recoater friction, with testing categorical accuracies greater than 90%. Lower model performance accompanied the more subtle spreading features present in the white alumina compared to the darker steel. Applying the models to new builds demonstrated their repeatability, and trends in classified pixels reflected corrections made to processing parameters. Through the development of robust analysis techniques and feedback for new materials, parameters can be corrected as builds progress. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
21. Prediction of carcass rib eye area by ultrasound images in sheep using computer vision.
- Author
-
Júnior, Francisco Albir Lima, Filho, Luiz Antônio Silva Figueiredo, de Sousa Júnior, Antônio, Silva, Romuere Rodrigues Veloso e., Barbosa, Bruna Lima, de Brito Vieira, Rafaela, Rocha, Artur Oliveira, de Moura Oliveira, Tiago, and Sarmento, José Lindenberg Rocha
- Subjects
- *
ULTRASONIC imaging , *COMPUTER vision , *RANDOM forest algorithms , *SHEEP , *AREA measurement - Abstract
The present research created a tool to measure ultrasound images of the rib eye area in sheep. One hundred twenty-one ultrasound images of sheep were captured, with regions of interest segmented using the U-Net algorithm. The metrics adopted to evaluate the automatic segmentations were the Dice score and intersection over union (IoU). Finally, a regression analysis was performed using the AdaBoost Regressor and Random Forest Regressor algorithms, and the fit of the models was evaluated using the mean squared residuals, mean absolute error, and coefficient of determination. The Dice score obtained was 0.94 and the IoU was 0.89, demonstrating a high similarity between the actual and predicted segmentations (both metrics range from 0 to 1). The mean squared residuals, mean absolute error, and coefficient of determination of the regressor models indicated that the Random Forest Regressor provided the best fit. The U-Net algorithm efficiently segmented ultrasound images of the Longissimus dorsi muscle, with greater precision than the measurements performed by the specialist. This efficient segmentation allowed the standardization of rib eye area measurements and, consequently, the phenotyping of beef sheep on a large scale. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
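The Dice score and intersection over union reported above can be computed directly from binary masks; a minimal numpy sketch with toy masks (not the paper's data or implementation):

```python
import numpy as np

def dice_and_iou(pred, truth):
    """Dice coefficient and intersection-over-union for binary masks."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    dice = 2.0 * inter / (pred.sum() + truth.sum())
    iou = inter / union
    return dice, iou

# Toy 4x4 masks: 3 overlapping pixels, 4 predicted, 3 true.
pred = np.array([[1, 1, 0, 0],
                 [1, 1, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])
truth = np.array([[1, 1, 0, 0],
                  [1, 0, 0, 0],
                  [0, 0, 0, 0],
                  [0, 0, 0, 0]])
dice, iou = dice_and_iou(pred, truth)  # dice = 6/7, iou = 3/4
```

The two metrics are monotonically related (IoU = Dice / (2 − Dice)), which is why papers often report both only as a convention.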
22. ENSEMBLE LEARNING-BASED AUTOMATIC DETECTION OF LANDSLIDE AREAS FROM AERIAL PHOTOGRAPHS.
- Author
-
Opara, Jonpaul Nnamdi, Moriwaki, Ryo, and Pang-jo Chun
- Abstract
Landslides pose a significant threat to human life and property worldwide. Japan, with its vulnerability to these natural disasters, records a high incidence of landslides. The Geospatial Information Authority of Japan employs experts to visually examine aerial photographs before and after landslide events, a costly and time-consuming approach that can limit accuracy. This study aims to aid in mitigating the damage caused by landslides through accurate and efficient mapping and prediction. An Ensemble U-Net model integrating three U-Nets has been proposed to predict landslide areas from aerial photographs. Comparative analysis with a single U-Net model revealed that the Ensemble model significantly outperformed the single model in all accuracy measures, including precision, recall, and F1-score. The ensemble model's average intersection over union (IoU) value of 0.80 also indicated a stronger agreement between the predicted outcome and ground truth than the single U-Net model. Visual analysis of prediction results further demonstrated the superiority of the ensemble model in aligning closely with the ground truth, thereby reducing misidentification and missed detections. The proposed Ensemble U-Net model shows promise for enhancing the accuracy and efficiency of landslide mapping. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
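The abstract does not specify how the three U-Nets are combined; one common ensembling scheme is to average the per-pixel probability maps before thresholding. A hedged numpy sketch of that scheme, with hypothetical probability maps standing in for the three trained U-Nets:

```python
import numpy as np

def ensemble_predict(prob_maps, threshold=0.5):
    """Average per-pixel probabilities from several models, then threshold
    to a binary landslide mask."""
    mean_prob = np.mean(prob_maps, axis=0)
    return (mean_prob >= threshold).astype(np.uint8)

# Three hypothetical 2x2 probability maps from three U-Nets.
maps = np.array([
    [[0.9, 0.2], [0.4, 0.8]],
    [[0.8, 0.1], [0.6, 0.7]],
    [[0.7, 0.3], [0.2, 0.9]],
])
mask = ensemble_predict(maps)  # [[1, 0], [0, 1]]
```

Averaging probabilities (soft voting) tends to be more robust than majority voting on hard masks, since it preserves each model's confidence.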
23. Study on Fractal Damage of Concrete Cracks Based on U-Net.
- Author
-
Xie, Ming, Wang, Zhangdong, Yin, Li'e, Xu, Fangbo, Wu, Xiangdong, and Xu, Mengqi
- Subjects
CONVOLUTIONAL neural networks ,REINFORCED concrete ,CRACKING of concrete ,FRACTAL dimensions ,IMAGE processing - Abstract
The damage degree of a reinforced concrete structure is closely related to the generation and expansion of cracks. However, the traditional damage assessment methods for reinforced concrete structures have shortcomings, including inefficient crack detection, inaccurate crack extraction, and dependence on inspectors' experience to evaluate structural damage. To address these problems, this paper proposes a damage assessment method for concrete members combining the U-Net convolutional neural network and crack fractal features. Firstly, the collected test crack images are input into U-Net for segmenting and extracting the output cracks. The damage to the concrete structure is then classified into four empirical levels according to the damage index (DI). Subsequently, a linear regression equation is constructed between the fractal dimension (D) of the cracks and the damage index (DI) of the reinforced concrete members. The damage assessment is then performed by predicting the damage index using linear regression. The method was subsequently employed to predict the damage level of a reinforced concrete shear wall–beam combination specimen, which was then compared with the actual damage level. The results demonstrate that the damage assessment method for concrete members proposed in this study is capable of effectively identifying the damage degree of the concrete members, indicating that the method is both robust and generalizable. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
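The fractal dimension D of a segmented crack mask is conventionally estimated by box counting; the paper does not give its exact procedure, so the following is only the textbook estimator in numpy:

```python
import numpy as np

def box_counting_dimension(mask, sizes=(1, 2, 4, 8, 16)):
    """Estimate fractal dimension D of a binary crack mask by box counting:
    count occupied boxes N(s) at several box sizes s, then fit
    log N(s) = -D * log s + c by least squares."""
    counts = []
    for s in sizes:
        h, w = mask.shape
        # Trim so the grid tiles the image exactly, then count boxes
        # that contain at least one crack pixel.
        trimmed = mask[: h - h % s, : w - w % s]
        boxes = trimmed.reshape(h // s, s, w // s, s).max(axis=(1, 3))
        counts.append(boxes.sum())
    slope, _ = np.polyfit(np.log(sizes), np.log(counts), 1)
    return -slope

# Sanity check: a straight 1-pixel crack should have dimension close to 1.
mask = np.zeros((64, 64), dtype=np.uint8)
mask[32, :] = 1
d_line = box_counting_dimension(mask)  # -> 1.0
```

A branching crack network fills the plane more densely than a line, so its estimated D lies between 1 and 2, which is what makes D usable as a regressor for the damage index.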
24. Segmentation of Glacier Area Using U-Net through Landsat Satellite Imagery for Quantification of Glacier Recession and Its Impact on Marine Systems.
- Author
-
Robbins, Edmund, Breininger, Robert D., Jiang, Maxwell, Madera, Michelle, White, Ryan T., and Kachouie, Nezamoddin N.
- Subjects
LANDSAT satellites ,REMOTE-sensing images ,LAND cover ,CLIMATE change ,SURFACE area - Abstract
Glaciers have experienced a global trend of recession within the past century. Quantification of glacier variations using satellite imagery has been of great interest due to the importance of glaciers as freshwater resources and as indicators of climate change. Spatiotemporal glacier dynamics must be monitored to quantify glacier variations. The potential methods to quantify spatiotemporal glacier dynamics with increasing complexity levels include detecting the terminus location, measuring the length of the glacier from the accumulation zone to the terminus, quantifying the glacier surface area, and measuring glacier volume. Although some deep learning methods designed purposefully for glacier boundary segmentation have achieved acceptable results, these models are often localized to the region where their training data were acquired and further rely on the training sets that were often curated manually to highlight glacial regions. Due to the very large number of glaciers, it is practically impossible to perform a worldwide study of glacier dynamics using manual methods. As a result, an automated or semi-automated method is highly desirable. The current study has built upon our previous works moving towards identification methods of the 2D glacier profile for glacier area segmentation. In this study, a deep learning method is proposed for segmentation of temporal Landsat images to quantify the glacial region within the Mount Cook/Aoraki massif located in the Southern Alps/Kā Tiritiri o te Moana of New Zealand/Aotearoa. Segmented glacial regions can be further utilized to determine the relationship of their variations due to climate change. This model has demonstrated promising performance while trained on a relatively small dataset. The permanent ice and snow class was accurately segmented at a 92% rate by the proposed model. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
25. Enhancing Brain Tumor Diagnosis with L-Net: A Novel Deep Learning Approach for MRI Image Segmentation and Classification.
- Author
-
Dénes-Fazakas, Lehel, Kovács, Levente, Eigner, György, and Szilágyi, László
- Subjects
CONVOLUTIONAL neural networks ,CANCER diagnosis ,PITUITARY tumors ,IMAGE recognition (Computer vision) ,MAGNETIC resonance imaging ,BRAIN tumors - Abstract
Background: Brain tumors are highly complex, making their detection and classification a significant challenge in modern medical diagnostics. The accurate segmentation and classification of brain tumors from MRI images are crucial for effective treatment planning. This study aims to develop an advanced neural network architecture that addresses these challenges. Methods: We propose L-net, a novel architecture combining U-net for tumor boundary segmentation and a convolutional neural network (CNN) for tumor classification. These two units are coupled in such a way that the CNN classifies the MRI images based on the features extracted by the U-net while segmenting the tumor, instead of relying on the original input images. The model is trained on a dataset of 3064 high-resolution MRI images, encompassing gliomas, meningiomas, and pituitary tumors, ensuring robust performance across different tumor types. Results: L-net achieved a classification accuracy of up to 99.6%, surpassing existing models in both segmentation and classification tasks. The model demonstrated effectiveness even with lower image resolutions, making it suitable for diverse clinical settings. Conclusions: The proposed L-net model provides an accurate and unified approach to brain tumor segmentation and classification. Its enhanced performance contributes to more reliable and precise diagnosis, supporting early detection and treatment in clinical applications. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
26. A Building Extraction Algorithm for Remote Sensing Images Integrating Partial Convolution and Residual Refinement.
- Author
-
侯佳兴, 齐向明, 郝明, and 张进
- Abstract
Copyright of Journal of Frontiers of Computer Science & Technology is the property of Beijing Journal of Computer Engineering & Applications Journal Co Ltd. and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
27. A time-frequency fusion model for multi-channel speech enhancement.
- Author
-
Zeng, Xiao, Xu, Shiyun, and Wang, Mingjiang
- Subjects
ARTIFICIAL neural networks ,SPEECH enhancement ,FEATURE extraction ,TIMEKEEPING - Abstract
Multi-channel speech enhancement plays a critical role in numerous speech-related applications. Several previous works explicitly utilize deep neural networks (DNNs) to exploit tempo-spectral signal characteristics, which often leads to excellent performance. In this work, we present a time-frequency fusion model, namely TFFM, for multi-channel speech enhancement. We utilize three cascaded U-Nets to capture three types of high-resolution features, aiming to investigate their individual contributions. To be specific, the first U-Net keeps the time dimension and performs feature extraction along the frequency dimension for the high-resolution spectral features with global temporal information, the second U-Net keeps the frequency dimension and extracts features along the time dimension for the high-resolution temporal features with global spectral information, and the third U-Net downsamples and upsamples along both the frequency and time dimensions for the high-resolution tempo-spectral features. These three cascaded U-Nets are designed to aggregate local and global features, thereby effectively handling the tempo-spectral information of speech signals. The proposed TFFM in this work outperforms state-of-the-art baselines. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
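The key design in TFFM is that each of the three cascaded U-Nets downsamples along a different subset of the (time, frequency) axes. The downsampling pattern can be illustrated with a toy axis-selective average pool in numpy (the actual model uses learned convolutional encoders, not this stand-in):

```python
import numpy as np

def avg_pool(x, pool_t, pool_f):
    """Average-pool a (time, freq) feature map, independently per axis.
    pool_t=1 keeps the time dimension intact (frequency-axis U-Net);
    pool_f=1 keeps the frequency dimension intact (time-axis U-Net)."""
    t, f = x.shape
    x = x[: t - t % pool_t, : f - f % pool_f]
    return x.reshape(t // pool_t, pool_t, f // pool_f, pool_f).mean(axis=(1, 3))

spec = np.arange(32.0).reshape(8, 4)   # toy (time=8, freq=4) spectrogram
freq_unet_in = avg_pool(spec, 1, 2)    # frequency downsampled only -> (8, 2)
time_unet_in = avg_pool(spec, 2, 1)    # time downsampled only      -> (4, 4)
joint_in     = avg_pool(spec, 2, 2)    # both axes                  -> (4, 2)
```

Keeping one axis at full resolution is what lets each U-Net produce "high-resolution" features along that axis while aggregating global context along the other.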
28. Retinex decomposition based low‐light image enhancement by integrating Swin transformer and U‐Net‐like architecture.
- Author
-
Wang, Zexin, Qingge, Letu, Pan, Qingyi, and Yang, Pei
- Subjects
- *
TRANSFORMER models , *IMAGE intensifiers , *VISUAL perception , *REFLECTANCE , *TEST methods - Abstract
Low‐light images are captured in environments with minimal lighting, such as nighttime or underwater conditions. These images often suffer from issues like low brightness, poor contrast, lack of detail, and overall darkness, significantly impairing human visual perception and subsequent high‐level visual tasks. Enhancing low‐light images holds great practical significance. Among the various existing methods for Low‐Light Image Enhancement (LLIE), those based on the Retinex theory have gained significant attention. However, despite considerable efforts in prior research, the challenge of Retinex decomposition remains unresolved. In this study, an LLIE network based on the Retinex theory is proposed, which addresses these challenges by integrating attention mechanisms and a U‐Net‐like architecture. The proposed model comprises three modules: the Decomposition module (DECM), the Reflectance Recovery module (REFM), and the Illumination Enhancement module (ILEM). Its objective is to decompose low‐light images based on the Retinex theory and enhance the decomposed reflectance and illumination maps using attention mechanisms and a U‐Net‐like architecture. We conducted extensive experiments on several widely used public datasets. The qualitative results demonstrate that the approach produces enhanced images with superior visual quality compared to the existing methods on all test datasets, especially for some extremely dark images. Furthermore, the quantitative evaluation results based on metrics PSNR, SSIM, LPIPS, BRISQUE, and MUSIQ show the proposed model achieves superior performance, with PSNR and BRISQUE significantly outperforming the baseline approaches, where (PSNR, mean BRISQUE) values of the proposed method and the second best results are (17.14, 17.72) and (16.44, 19.65). Additionally, further experimental results such as ablation studies indicate the effectiveness of the proposed model. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
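The Retinex decomposition that DECM learns splits an image into reflectance and illumination, I = R · L. The classic hand-crafted baseline estimates illumination by smoothing in the log domain; a numpy sketch of that baseline (the paper replaces this with attention and U-Net-like modules):

```python
import numpy as np

def retinex_decompose(image, k=15):
    """Toy Retinex split in the log domain: estimate illumination with a
    naive box blur, take reflectance as the log-ratio.  Only a classic
    hand-crafted stand-in for the learned decomposition module."""
    eps = 1e-6
    log_i = np.log(image + eps)
    pad = k // 2
    padded = np.pad(log_i, pad, mode="edge")
    illum = np.zeros_like(log_i)
    for dy in range(k):          # box filter via shifted sums
        for dx in range(k):
            illum += padded[dy:dy + log_i.shape[0], dx:dx + log_i.shape[1]]
    illum /= k * k
    reflectance = log_i - illum
    return np.exp(reflectance), np.exp(illum)

img = np.full((32, 32), 0.1)             # uniformly dark image
refl, illum = retinex_decompose(img)     # reflectance ~1, illumination ~0.1
```

Enhancement then amounts to brightening the illumination map (e.g., gamma correction) and recombining it with the reflectance, which is exactly the role of ILEM in the proposed model.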
29. RDAG U-Net: An Advanced AI Model for Efficient and Accurate CT Scan Analysis of SARS-CoV-2 Pneumonia Lesions.
- Author
-
Lee, Chih-Hui, Pan, Cheng-Tang, Lee, Ming-Chan, Wang, Chih-Hsuan, Chang, Chun-Yung, and Shiue, Yow-Ling
- Subjects
- *
IMAGE analysis , *ARTIFICIAL intelligence , *LUNG diseases , *COMPUTED tomography , *RESPIRATORY infections - Abstract
Background/Objective: This study aims to utilize advanced artificial intelligence (AI) image recognition technologies to establish a robust system for identifying features in lung computed tomography (CT) scans, thereby detecting respiratory infections such as SARS-CoV-2 pneumonia. Specifically, the research focuses on developing a new model called Residual-Dense-Attention Gates U-Net (RDAG U-Net) to improve accuracy and efficiency in identification. Methods: This study employed Attention U-Net, Attention Res U-Net, and the newly developed RDAG U-Net model. RDAG U-Net extends the U-Net architecture by incorporating ResBlock and DenseBlock modules in the encoder to retain training parameters and reduce computation time. The training dataset includes 3,520 CT scans from an open database, augmented to 10,560 samples through data enhancement techniques. The research also focused on optimizing convolutional architectures, image preprocessing, interpolation methods, data management, and extensive fine-tuning of training parameters and neural network modules. Results: The RDAG U-Net model achieved an outstanding accuracy of 93.29% in identifying pulmonary lesions, with a 45% reduction in computation time compared to other models. The study demonstrated that RDAG U-Net performed stably during training and exhibited good generalization capability by evaluating loss values, model-predicted lesion annotations, and validation-epoch curves. Furthermore, using ITK-Snap to convert 2D predictions into 3D lung and lesion segmentation models, the results delineated lesion contours, enhancing interpretability. Conclusion: The RDAG U-Net model showed significant improvements in accuracy and efficiency in the analysis of CT images for SARS-CoV-2 pneumonia, achieving a 93.29% recognition accuracy and reducing computation time by 45% compared to other models. 
These results indicate the potential of the RDAG U-Net model in clinical applications, as it can accelerate the detection of pulmonary lesions and effectively enhance diagnostic accuracy. Additionally, the 2D and 3D visualization results allow physicians to understand lesions' morphology and distribution better, strengthening decision support capabilities and providing valuable medical diagnosis and treatment planning tools. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
30. Snow Cover Extraction from Landsat 8 OLI Based on Deep Learning with Cross-Scale Edge-Aware and Attention Mechanism.
- Author
-
Yu, Zehao, Gong, Hanying, Zhang, Shiqiang, and Wang, Wei
- Subjects
- *
WATER management , *OPTICAL remote sensing , *LANDSAT satellites , *REMOTE sensing , *DEEP learning , *SNOW cover - Abstract
Snow cover distribution is of great significance for climate change and water resource management. Current deep learning-based methods for extracting snow cover from remote sensing images face challenges such as insufficient local detail awareness and inadequate utilization of global semantic information. In this study, a snow cover extraction algorithm integrating cross-scale edge perception and an attention mechanism into the U-Net architecture is proposed. The cross-scale edge perception module replaces the original skip connections of U-Net, enhancing low-level image features by introducing edge detection at the shallow feature scale, and enhancing detail perception via branch separation and feature fusion at the deep feature scale. Meanwhile, parallel channel and spatial attention mechanisms are introduced in the model encoding stage to adaptively enhance the model's attention to key features and improve the efficiency of utilizing global semantic information. The method was evaluated on the publicly available CSWV_S6 optical remote sensing dataset, and the accuracy of 98.14% indicates that the method has significant advantages over existing methods. Snow extraction from Landsat 8 OLI images of the upper reaches of the Irtysh River was achieved with satisfactory accuracy rates of 95.57% (using two, three, and four bands) and 96.65% (using two, three, four, and six bands), indicating its strong potential for automated snow cover extraction over larger areas. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
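The "parallel channel and spatial attention" gating in the encoder follows a common pattern: reweight features per channel from a global pool, and per position from a channel-wise pool. A numpy sketch of that pattern (the learned MLP/conv layers inside each gate are omitted; this is not the paper's exact module):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat):
    """Squeeze-and-excitation-style gate: global average pool per channel,
    then a sigmoid weight (learned layers omitted for brevity)."""
    w = sigmoid(feat.mean(axis=(1, 2)))          # (C,)
    return feat * w[:, None, None]

def spatial_attention(feat):
    """Spatial gate computed from the channel-wise mean map."""
    m = sigmoid(feat.mean(axis=0))               # (H, W)
    return feat * m[None, :, :]

feat = np.random.default_rng(0).normal(size=(6, 8, 8))   # (C, H, W)
out = channel_attention(feat) + spatial_attention(feat)  # parallel branches, summed
```

Running the two gates in parallel (rather than in sequence, as in CBAM) lets channel and spatial reweighting act on the same input features before fusion.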
31. A Comparison of Local and Global Strategies for Exploiting Field Inversion on Separated Flows at Low Reynolds Number.
- Author
-
Muscarà, Luca, Cisternino, Marco, Ferrero, Andrea, Iob, Andrea, and Larocca, Francesco
- Subjects
REYNOLDS number ,MACHINE learning ,PROBLEM solving ,AEROFOILS ,FORECASTING - Abstract
The prediction of separated flows at low Reynolds numbers is crucial for several applications in the aerospace and energy fields. Reynolds-averaged Navier–Stokes (RANS) equations are widely used, but their accuracy is limited in the presence of transition or separation. In this work, two different strategies for improving RANS simulations by means of field inversion are discussed. Both strategies require solving an optimization problem to identify a correction field by minimizing the error on some measurable data. The obtained correction field is exploited with two alternative strategies. The first strategy aims to identify a relation that expresses the local correction field as a function of some local flow features. However, this regression can be difficult or even impossible because the relation between the assumed input variables and the local correction may not be a function. For this reason, an alternative is proposed: a U-Net model is trained on the original and corrected RANS results. In this way, it is possible to perform a prediction with the original RANS model and then correct it by means of the U-Net. The methodologies are evaluated and compared on the flow around the NACA0021 and SD7003 airfoils. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
32. Optic Nerve Sheath Ultrasound Image Segmentation Based on CBC-YOLOv5s.
- Author
-
Chu, Yonghua, Xu, Jinyang, Wu, Chunshuang, Ye, Jianping, Zhang, Jucheng, Shen, Lei, Wang, Huaxia, and Yao, Yudong
- Subjects
OPTIC nerve ,IMAGE segmentation ,ULTRASONIC imaging ,TRANSFORMER models ,FEATURE extraction ,DEEP learning - Abstract
The diameter of the optic nerve sheath is an important indicator for assessing the intracranial pressure in critically ill patients. The methods for measuring the optic nerve sheath diameter are generally divided into invasive and non-invasive methods. Compared to the invasive methods, the non-invasive methods are safer and have thus gained popularity. Among the non-invasive methods, using deep learning to process the ultrasound images of the eyes of critically ill patients and promptly output the diameter of the optic nerve sheath offers significant advantages. This paper proposes a CBC-YOLOv5s optic nerve sheath ultrasound image segmentation method that integrates both local and global features. First, it introduces the CBC-Backbone feature extraction network, which consists of dual-layer C3 Swin-Transformer (C3STR) and dual-layer Bottleneck Transformer (BoT3) modules. The C3STR backbone's multi-layer convolution and residual connections focus on the local features of the optic nerve sheath, while the Window Transformer Attention (WTA) mechanism in the C3STR module and the Multi-Head Self-Attention (MHSA) in the BoT3 module enhance the model's understanding of the global features of the optic nerve sheath. The extracted local and global features are fully integrated in the Spatial Pyramid Pooling Fusion (SPPF) module. Additionally, the CBC-Neck feature pyramid is proposed, which includes a single-layer C3STR module and three-layer CReToNeXt (CRTN) module. During upsampling feature fusion, the C3STR module is used to enhance the local and global awareness of the fused features. During downsampling feature fusion, the CRTN module's multi-level residual design helps the network to better capture the global features of the optic nerve sheath within the fused features. 
The introduction of these modules achieves the thorough integration of the local and global features, enabling the model to efficiently and accurately identify the optic nerve sheath boundaries, even when the ocular ultrasound images are blurry or the boundaries are unclear. The Z2HOSPITAL-5000 dataset collected from Zhejiang University Second Hospital was used for the experiments. Compared to the widely used YOLOv5s and U-Net algorithms, the proposed method shows improved performance on the blurry test set. Specifically, the proposed method achieves precision, recall, and Intersection over Union (IoU) values that are 4.1%, 2.1%, and 4.5% higher than those of YOLOv5s. When compared to U-Net, the precision, recall, and IoU are improved by 9.2%, 21%, and 19.7%, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
33. U-DeepONet: U-Net enhanced deep operator network for geologic carbon sequestration.
- Author
-
Diab, Waleed and Al Kobaisi, Mohammed
- Subjects
- *
ARTIFICIAL neural networks , *GEOLOGICAL carbon sequestration , *POROUS materials , *TWO-phase flow , *SCIENCE education , *SCIENTIFIC computing - Abstract
Learning operators with deep neural networks is an emerging paradigm for scientific computing. Deep Operator Network (DeepONet) is a modular operator learning framework that allows for flexibility in choosing the kind of neural network to be used in the trunk and/or branch of the DeepONet. This is beneficial as it has been shown many times that different types of problems require different kinds of network architectures for effective learning. In this work, we design an efficient neural operator based on the DeepONet architecture. We introduce U-Net enhanced DeepONet (U-DeepONet) for learning the solution operator of highly complex CO2-water two-phase flow in heterogeneous porous media. The U-DeepONet is more accurate in predicting gas saturation and pressure buildup than the state-of-the-art U-Net based Fourier Neural Operator (U-FNO) and the Fourier-enhanced Multiple-Input Operator (Fourier-MIONet) trained on the same dataset. Moreover, our U-DeepONet is significantly more efficient in training times than both the U-FNO (more than 18 times faster) and the Fourier-MIONet (more than 5 times faster), while consuming less computational resources. We also show that the U-DeepONet is more data efficient and better at generalization than both the U-FNO and the Fourier-MIONet. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
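The DeepONet framework combines a branch net (which encodes the input function) and a trunk net (which encodes query coordinates) by an inner product over p latent features: G(u)(y) = Σ_k b_k(u) t_k(y) + b0. A numpy sketch of just this combination step, with random arrays standing in for the trained U-Net-enhanced branch and trunk outputs:

```python
import numpy as np

rng = np.random.default_rng(0)

p = 16                                   # number of latent features
branch_out = rng.normal(size=(1, p))     # one input function -> p coefficients
trunk_out = rng.normal(size=(100, p))    # 100 query points   -> p basis values each
bias = 0.1

# Operator output at every query point: an inner product over the latent dim.
prediction = trunk_out @ branch_out.T + bias   # shape (100, 1)
```

This factorization is what makes DeepONet modular: the trunk/branch internals can be swapped (here, U-Net-enhanced components) without changing how the two halves are combined.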
34. A Segmentation-Based Automated Corneal Ulcer Grading System for Ocular Staining Images Using Deep Learning and Hough Circle Transform.
- Author
-
Manawongsakul, Dulyawat and Patanukhom, Karn
- Subjects
- *
CONVOLUTIONAL neural networks , *HOUGH transforms , *IMAGE processing , *IMAGE segmentation , *DEEP learning ,CORNEAL ulcer - Abstract
Corneal ulcer is a prevalent ocular condition that requires ophthalmologists to diagnose, assess, and monitor symptoms. During examination, ophthalmologists must identify the corneal ulcer area and evaluate its severity by manually comparing ocular staining images with severity indices. However, manual assessment is time-consuming and may provide inconsistent results. Variations can occur with repeated evaluations of the same images or with grading among different evaluators. To address this problem, we propose an automated corneal ulcer grading system for ocular staining images based on deep learning techniques and the Hough Circle Transform. The algorithm is structured into two components for cornea segmentation and corneal ulcer segmentation. Initially, we apply a deep learning method combined with the Hough Circle Transform to segment cornea areas. Subsequently, we develop the corneal ulcer segmentation model using deep learning methods. In this phase, the predicted cornea areas are utilized as masks for training the corneal ulcer segmentation models during the learning phase. Finally, this algorithm uses the results from these two components to determine two outputs: (1) the percentage of the ulcerated area on the cornea, and (2) the severity degree of the corneal ulcer based on the Type–Grade (TG) grading standard. These methodologies aim to enhance diagnostic efficiency across two key aspects: (1) ensuring consistency by delivering uniform and dependable results, and (2) enhancing robustness by effectively handling variations in eye size. In this research, our proposed method is evaluated using the SUSTech-SYSU public dataset, achieving an Intersection over Union of 89.23% for cornea segmentation and 82.94% for corneal ulcer segmentation, along with a Mean Absolute Error of 2.51% for determining the percentage of the ulcerated area on the cornea and an Accuracy of 86.15% for severity grading. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
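The Hough Circle Transform used for cornea localization accumulates votes for candidate circle centres from edge points. A minimal single-radius accumulator in numpy (real pipelines would use an implementation such as OpenCV's `HoughCircles` and sweep over radii; this toy version assumes the radius is known):

```python
import numpy as np

def hough_circle(edge_points, radius, shape):
    """Minimal Hough circle accumulator for one known radius: each edge
    point votes for every candidate centre lying `radius` away from it."""
    acc = np.zeros(shape, dtype=np.int32)
    thetas = np.linspace(0.0, 2.0 * np.pi, 360, endpoint=False)
    for (y, x) in edge_points:
        cy = np.round(y - radius * np.sin(thetas)).astype(int)
        cx = np.round(x - radius * np.cos(thetas)).astype(int)
        ok = (cy >= 0) & (cy < shape[0]) & (cx >= 0) & (cx < shape[1])
        np.add.at(acc, (cy[ok], cx[ok]), 1)   # unbuffered accumulation
    return acc

# Synthetic circular edge of radius 10 centred at (32, 32).
angles = np.linspace(0.0, 2.0 * np.pi, 80, endpoint=False)
points = [(32 + 10 * np.sin(a), 32 + 10 * np.cos(a)) for a in angles]
acc = hough_circle(points, radius=10, shape=(64, 64))
centre = np.unravel_index(acc.argmax(), acc.shape)   # -> (32, 32)
```

Every edge point's vote circle passes through the true centre, so the accumulator peaks there; that robustness to partial or noisy boundaries is why the transform helps with variable eye sizes.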
35. A Comparative Analysis of U-Net and Vision Transformer Architectures in Semi-Supervised Prostate Zonal Segmentation.
- Author
-
Huang, Guantian, Xia, Bixuan, Zhuang, Haoming, Yan, Bohan, Wei, Cheng, Qi, Shouliang, Qian, Wei, and He, Dianning
- Subjects
- *
TRANSFORMER models , *DIAGNOSTIC imaging , *AUTODIDACTICISM , *TIME-varying networks , *ENTROPY - Abstract
The precise segmentation of different regions of the prostate is crucial in the diagnosis and treatment of prostate-related diseases. However, the scarcity of labeled prostate data poses a challenge for the accurate segmentation of its different regions. We perform the segmentation of different regions of the prostate using U-Net- and Vision Transformer (ViT)-based architectures. We use five semi-supervised learning methods, including entropy minimization, cross pseudo-supervision, mean teacher, uncertainty-aware mean teacher (UAMT), and interpolation consistency training (ICT) to compare the results with the state-of-the-art prostate semi-supervised segmentation network uncertainty-aware temporal self-learning (UATS). The UAMT method improves the prostate segmentation accuracy and provides stable prostate region segmentation results. ICT plays a more stable role in the prostate region segmentation results, which provides strong support for the medical image segmentation task, and demonstrates the robustness of U-Net for medical image segmentation. UATS is still more applicable to the U-Net backbone and has a very significant effect on a positive prediction rate. However, the performance of ViT in combination with semi-supervision still requires further optimization. This comparative analysis applies various semi-supervised learning methods to prostate zonal segmentation. It guides future prostate segmentation developments and offers insights into utilizing limited labeled data in medical imaging. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
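Of the five semi-supervised methods compared above, entropy minimization is the simplest to illustrate: on unlabeled pixels, the entropy of the softmax prediction is penalized so the model is pushed toward confident outputs. A hedged stdlib-only sketch with hypothetical logits, not the paper's implementation:

```python
import math

# Sketch of the entropy-minimization loss for semi-supervised segmentation:
# lower entropy means more confident per-pixel class predictions.

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def entropy_loss(per_pixel_logits):
    """Mean Shannon entropy of per-pixel class predictions."""
    total = 0.0
    for logits in per_pixel_logits:
        probs = softmax(logits)
        total += -sum(p * math.log(p + 1e-12) for p in probs)
    return total / len(per_pixel_logits)

confident = [[8.0, 0.0, 0.0], [0.0, 9.0, 0.0]]   # near one-hot predictions
uncertain = [[0.0, 0.0, 0.0], [0.1, 0.0, 0.1]]   # near-uniform predictions
assert entropy_loss(confident) < entropy_loss(uncertain)
```

In training, this term would be added to the supervised loss on the labeled subset; the weighting between the two is a design choice not specified in the abstract.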
36. Scale- and Resolution-Adapted Shaded Relief Generation Using U-Net.
- Author
-
Farmakis-Serebryakova, Marianna, Heitzler, Magnus, and Hurni, Lorenz
- Subjects
- *
DIGITAL elevation models , *WEB-based user interfaces , *WEB design , *MACHINE learning , *EQUITABLE remedies (Law) , *HISTOGRAMS - Abstract
On many maps, relief shading is one of the most significant graphical elements, and modern relief shading techniques include neural networks. To generate such shading automatically at an arbitrary scale, one needs to consider how the resolution of the input digital elevation model (DEM) relates to the neural network process and to the maps used for training. Currently, there is no clear guidance on which DEM resolution to use to generate relief shading at a specific scale. To address this gap, we trained U-Net models on swisstopo manual relief shadings of Switzerland at four different scales, using four different resolutions of the SwissALTI3D DEM. An interactive web application designed for this study allows users to outline a random area and compare brightness histograms between predictions and manual relief shadings. The results showed that DEM resolution and output scale influence the appearance of the relief shading, with a consistent overall ratio between scale and resolution. We present guidelines for generating relief shading with neural networks for arbitrary areas and scales. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
37. A Hybrid Method for Multiple Sclerosis Lesion Segmentation Using Wavelet and Dense U-Net.
- Author
-
Alijamaat, Ali, Mirhosseini, Seyed Mohsen, and Aliakbari, Reyhaneh
- Subjects
- *
CENTRAL nervous system , *MULTIPLE sclerosis , *IMAGE processing , *WHITE matter (Nerve tissue) , *WAVELET transforms - Abstract
Multiple Sclerosis (MS) is one of the debilitating disorders of the central nervous system. This disease causes lesions in the white matter of the brain tissue. It can also lead to many physical and psychological disorders affecting movement, vision, and memory. Lesion segmentation in MRI images to determine the number and size of lesions is one of the diagnostic problems for specialists, and automated diagnostic tools can aid professionals. Traditional image processing and deep learning methods are used to automate lesion segmentation. The U-Net is one of the most widely used deep learning architectures for MS lesion segmentation; however, the standard U-Net processes images in a single domain and therefore does not capture all of their features. Our proposed method combines the Haar wavelet transform with a DenseNet-based U-Net. This makes local features and lesions of different sizes more prominent and leads to higher-quality segmentation. In our experiments, the proposed method achieved a better Dice value than the compared methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
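The wavelet front end combined with the U-Net above can be illustrated with a single-level 2D Haar decomposition. This is a pure-Python sketch on an even-sized grayscale grid; a real pipeline would use a wavelet library such as PyWavelets:

```python
# Minimal single-level 2D Haar decomposition: each 2x2 block yields one
# coefficient per sub-band (approximation plus three detail directions).

def haar2d(img):
    """Return (LL, LH, HL, HH) sub-bands, each half the input size."""
    h, w = len(img), len(img[0])
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, h, 2):
        rll, rlh, rhl, rhh = [], [], [], []
        for j in range(0, w, 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            rll.append((a + b + c + d) / 4)  # smooth approximation
            rlh.append((a - b + c - d) / 4)  # horizontal detail
            rhl.append((a + b - c - d) / 4)  # vertical detail
            rhh.append((a - b - c + d) / 4)  # diagonal detail
        ll.append(rll); lh.append(rlh); hl.append(rhl); hh.append(rhh)
    return ll, lh, hl, hh

img = [[1, 1, 5, 5],
       [1, 1, 5, 5]]
ll, lh, hl, hh = haar2d(img)
print(ll)  # [[1.0, 5.0]] -- smooth approximation
print(lh)  # [[0.0, 0.0]] -- no detail inside each uniform 2x2 block
```

Feeding sub-bands like these alongside the image gives the network explicit multi-resolution features, which is the intuition behind emphasizing lesions of different sizes.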
38. A Multi-Scale Liver Tumor Segmentation Method Based on Residual and Hybrid Attention Enhanced Network with Contextual Integration.
- Author
-
Sun, Liyan, Jiang, Linqing, Wang, Mingcong, Wang, Zhenyan, and Xin, Yi
- Subjects
- *
FEATURE extraction , *LIVER tumors , *PARALLEL processing , *LIVER cancer , *DEATH rate - Abstract
Liver cancer is one of the malignancies with high mortality rates worldwide, and its timely detection and accurate diagnosis are crucial for improving patient prognosis. To address the limitations of traditional image segmentation techniques and the U-Net network in capturing fine image features, this study proposes an improved model based on the U-Net architecture, named RHEU-Net. By replacing traditional convolution modules in the encoder and decoder with improved residual modules, the network's feature extraction capabilities and gradient stability are enhanced. A Hybrid Gated Attention (HGA) module is integrated before the skip connections, enabling the parallel processing of channel and spatial attentions, optimizing the feature fusion strategy, and effectively replenishing image details. A Multi-Scale Feature Enhancement (MSFE) layer is introduced at the bottleneck, utilizing multi-scale feature extraction technology to further enhance the expression of receptive fields and contextual information, improving the overall feature representation effect. Testing on the LiTS2017 dataset demonstrated that RHEU-Net achieved Dice scores of 95.72% for liver segmentation and 70.19% for tumor segmentation. These results validate the effectiveness of RHEU-Net and underscore its potential for clinical application. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
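The Dice scores reported above (95.72% for liver, 70.19% for tumor) follow the standard overlap definition. A short sketch with hypothetical binary masks, not LiTS2017 data:

```python
# Dice = 2|A ∩ B| / (|A| + |B|) for binary segmentation masks.

def dice(pred, target):
    inter = sum(p * t for pr, tr in zip(pred, target)
                for p, t in zip(pr, tr))
    total = sum(v for row in pred for v in row) + \
            sum(v for row in target for v in row)
    return 1.0 if total == 0 else 2.0 * inter / total

pred   = [[1, 1, 0, 0]]
target = [[1, 0, 0, 0]]
print(dice(pred, target))  # 0.666... = 2*1 / (2+1)
```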
39. Titanium Alloy Weld Time-of-Flight Diffraction Image Denoising Based on a Wavelet Feature Fusion Deep-Learning Model.
- Author
-
Zhi, Zelin, Jiang, Hongquan, Yang, Deyan, Yue, Kun, Gao, Jianmin, Cheng, Zhixiang, Xu, Yongjun, Geng, Qiang, and Zhou, Wei
- Subjects
- *
IMAGE denoising , *WELDED joints , *WELDING , *NONDESTRUCTIVE testing , *IMAGE fusion , *TITANIUM alloys - Abstract
Images of titanium alloy welds acquired by time-of-flight diffraction (TOFD) suffer from strong noise and many interference streaks around defects, which seriously limit the accuracy and effectiveness of defect recognition. Existing image denoising methods lack knowledge of the noise characteristics of titanium alloy weld TOFD images and of the preprocessing experience of technicians in the field. In addition, the parameters of these preprocessing methods are difficult to select and depend heavily on the skill of the technician, resulting in low efficiency and poor consistency. To address these problems, we propose a denoising method for TOFD images of titanium alloy welds that combines wavelet band features with deep-learning theory. First, based on the wavelet preprocessing method and the experience of nondestructive testing (NDT) technicians, we constructed an image-pair dataset consisting of original TOFD images of titanium alloy welds and the desired target images, capturing the engineers' preprocessing knowledge. Second, we constructed a multiband wavelet feature fusion U-net image denoising model (WU-net) and designed a loss function under three constraints: image consistency, image texture consistency, and structural similarity. This model learns end-to-end adaptive denoising for TOFD images of titanium alloy welds. Third, we illustrated and validated the effectiveness of TOFD image preprocessing for titanium alloy welds. The results showed that the proposed method effectively eliminated TOFD image noise and improved the accuracy of defect recognition. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
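The three-constraint loss named above can be sketched as a weighted sum. This is an illustrative stand-in only: the weights, the gradient-based texture term, and the simplified mean/variance "structure" term are assumptions, not the paper's formulation (which uses structural similarity, i.e. SSIM):

```python
# Toy loss with three terms: image consistency (pixel MSE), texture
# consistency (MSE of horizontal gradients), and a simplified structural
# term (mean/variance match). All weights are hypothetical.

def flat(img):
    return [v for row in img for v in row]

def mse(a, b):
    fa, fb = flat(a), flat(b)
    return sum((x - y) ** 2 for x, y in zip(fa, fb)) / len(fa)

def grad_x(img):
    # Horizontal first differences as a crude texture descriptor.
    return [[row[j + 1] - row[j] for j in range(len(row) - 1)] for row in img]

def stats(img):
    f = flat(img)
    m = sum(f) / len(f)
    var = sum((x - m) ** 2 for x in f) / len(f)
    return m, var

def denoise_loss(pred, target, w=(1.0, 0.5, 0.5)):
    consistency = mse(pred, target)
    texture = mse(grad_x(pred), grad_x(target))
    (mp, vp), (mt, vt) = stats(pred), stats(target)
    structure = (mp - mt) ** 2 + (vp - vt) ** 2
    return w[0] * consistency + w[1] * texture + w[2] * structure

clean = [[0, 0, 9, 9]]
assert denoise_loss(clean, clean) == 0.0        # identical images: zero loss
assert denoise_loss([[0, 0, 0, 0]], clean) > 0.0
```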
40. CST-UNet: Cross Swin Transformer Enhanced U-Net with Masked Bottleneck for Single-Channel Speech Enhancement.
- Author
-
Zhang, Zipeng, Chen, Wei, Guo, Weiwei, Liu, Yiming, Yang, Jianhua, and Liu, Houguang
- Subjects
- *
SPEECH enhancement , *TRANSFORMER models , *COMPUTATIONAL complexity , *CORPORA , *DEEP learning - Abstract
Speech enhancement performance has improved significantly with the introduction of deep learning models, especially methods based on the Long Short-Term Memory architecture. However, these methods face challenges such as high computational complexity and redundancy in the input features. To address these issues, we propose a U-Net-based approach that uses an encoder/decoder to extract more concise features, improving single-channel speech enhancement performance while reducing computational complexity. The proposed method includes a Cross-Swin-Transformer block and a masked bottleneck module, which down-sample features while preserving detailed representations through skip connections and carefully designed blocks. The bottleneck module extracts coarse representations of the hidden features as masks. We evaluated our method against other U-Net-based approaches on the VCTK and DNS corpora using the CBAK, eSTOI, PESQ, STOI, and SI-SDR metrics. The results demonstrate that the proposed method achieves promising performance while significantly reducing computational complexity. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
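Among the metrics listed above, SI-SDR (scale-invariant signal-to-distortion ratio) has a compact closed form: project the estimate onto the reference to get the target component, and compare its energy to the residual. A sketch on short hypothetical signals rather than VCTK/DNS waveforms:

```python
import math

# SI-SDR in dB: 10*log10(||s_target||^2 / ||e||^2), where s_target is the
# orthogonal projection of the estimate onto the reference signal.

def si_sdr(estimate, reference):
    dot = sum(e * r for e, r in zip(estimate, reference))
    ref_energy = sum(r * r for r in reference)
    scale = dot / ref_energy
    target = [scale * r for r in reference]           # projection onto reference
    error = [e - t for e, t in zip(estimate, target)]
    err_energy = sum(x * x for x in error)
    tgt_energy = sum(x * x for x in target)
    return 10.0 * math.log10(tgt_energy / (err_energy + 1e-12))

ref = [0.0, 1.0, -1.0, 0.5]
noisy = [0.1, 0.9, -1.1, 0.6]
# Scale invariance: rescaling the estimate leaves SI-SDR (nearly) unchanged.
print(round(si_sdr(noisy, ref), 6) == round(si_sdr([2 * x for x in noisy], ref), 6))
```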
41. Automated shoreline extraction process for unmanned vehicles via U-net with heuristic algorithm.
- Author
-
Prokop, Katarzyna, Połap, Dawid, Włodarczyk-Sielicka, Marta, Połap, Karolina, Jaszcz, Antoni, and Stateczny, Andrzej
- Subjects
HEURISTIC algorithms ,DATABASES ,GEOGRAPHIC boundaries ,IMAGE processing ,REAL estate development - Abstract
Detecting the shoreline is an important task with many potential uses. The shoreline allows an image to be split into two separate areas, the water and the shore, which is particularly useful because such images can be used to analyze pollution, land development, or even waterfront erosion. Unfortunately, automatic shoreline detection is a complex problem due to numerous physical and atmospheric factors. In this paper, we present a solution based on a U-net convolutional network trained for shoreline detection on a dedicated database. The database is generated automatically by applying image processing techniques and a heuristic algorithm. Using the heuristic, optimal values of the mask generation parameters are determined. Consequently, the solution automates the generation of a set of masks by analyzing the boundary line and the efficiency of the segmentation network. The proposed solution allows for analysis of the coastline, where potential obstacles and even occurring waves can be quickly detected. To evaluate the proposed solution, tests were carried out in real conditions, which showed the effectiveness of the model. In addition, tests on a publicly available database yielded better results than existing methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
42. Ocean Currents Velocity Hindcast and Forecast Bias Correction Using a Deep-Learning Approach.
- Author
-
Muhamed Ali, Ali, Zhuang, Hanqi, Huang, Yu, Ibrahim, Ali K., Altaher, Ali Salem, and Chérubin, Laurent M.
- Subjects
OCEAN currents ,OCEAN dynamics ,NUMERICAL calculations ,CURRENT transformers (Instrument transformer) ,DEEP learning - Abstract
Today's prediction of ocean dynamics relies on numerical models. However, numerical models are often unable to accurately model and predict real ocean dynamics, preventing the fulfillment of a range of services that require reliable predictions at various temporal and spatial scales. Indeed, a numerical model cannot fully resolve all the physical processes in the ocean, for reasons including biases in the initial field and calculation errors in the numerical solution of the model. Thus, bias-correction methods have become crucial for improving the dynamical accuracy of numerical model predictions. In this study, we present a machine learning-based three-dimensional velocity bias-correction method, derived from historical observations, that applies to both hindcasts and forecasts. Our approach is based on the modification of an existing deep learning model, called U-Net, originally designed for image segmentation in the biomedical field. U-Net was modified to create a Transform Model that retains the temporal and spatial evolution of the differences between the model and observations, producing a correction in the form of regression weights that evolve spatially and temporally with the model, both forward and backward in time, beyond the observation period. Using daily ocean current observations from a 2.5-year current meter array deployment, we show that significant bias corrections can be conducted up to 50 days pre- or post-observation. Using a 3-year-long virtual array, valid bias corrections can be conducted for up to one year. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
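The core idea of correcting model velocities with regression weights can be reduced to a per-grid-point linear fit over the observation period, then applied outside it. This plain least-squares sketch is a stand-in for the paper's learned, time-evolving Transform Model; all numbers are hypothetical:

```python
# Fit (slope, intercept) mapping model velocities to observed velocities,
# then apply the weights to correct a value outside the observation window.

def fit_weights(model_series, obs_series):
    n = len(model_series)
    mx = sum(model_series) / n
    my = sum(obs_series) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(model_series, obs_series))
    sxx = sum((x - mx) ** 2 for x in model_series)
    w = sxy / sxx
    b = my - w * mx
    return w, b

def correct(model_value, w, b):
    return w * model_value + b

# Toy case: the model overestimates current speed linearly (obs = 0.8*model + 0.1).
model = [1.0, 2.0, 3.0, 4.0]
obs   = [0.9, 1.7, 2.5, 3.3]
w, b = fit_weights(model, obs)
print(round(correct(5.0, w, b), 2))  # 4.1 -- corrected forecast value
```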
43. A Deep Learning Strategy for the Retrieval of Sea Wave Spectra from Marine Radar Data.
- Author
-
Ludeno, Giovanni, Esposito, Giuseppe, Lugni, Claudio, Soldovieri, Francesco, and Gennarelli, Gianluca
- Subjects
CONVOLUTIONAL neural networks ,OCEAN waves ,TRANSFER functions ,FAST Fourier transforms ,DEEP learning - Abstract
In the context of sea state monitoring, reconstructing the wave field and estimating the sea state parameters from radar data is a challenging problem. To reach this goal, this paper proposes a fully data-driven, deep learning approach based on a convolutional neural network. The network takes as input the radar image spectrum and outputs the sea wave directional spectrum. After a 2D fast Fourier transform, the wave elevation field is reconstructed, and accordingly, the sea state parameters are estimated. The reconstruction strategy, herein presented, is tested using numerical data generated from a synthetic sea wave simulator, considering the spectral proprieties of the Joint North Sea Wave Observation Project model. A performance analysis of the proposed deep-learning estimation strategy is carried out, along with a comparison to the classical modulation transfer function approach. The results demonstrate that the proposed approach is effective in reconstructing the directional wave spectrum across different sea states. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
44. A Multidimensional Framework Incorporating 2D U-Net and 3D Attention U-Net for the Segmentation of Organs from 3D Fluorodeoxyglucose-Positron Emission Tomography Images.
- Author
-
Vezakis, Andreas, Vezakis, Ioannis, Vagenas, Theodoros P., Kakkos, Ioannis, and Matsopoulos, George K.
- Subjects
CONVOLUTIONAL neural networks ,ANATOMICAL planes ,POSITRON emission tomography ,HEART ventricles ,DEEP learning - Abstract
Accurate analysis of Fluorodeoxyglucose (FDG)-Positron Emission Tomography (PET) images is crucial for the diagnosis, treatment assessment, and monitoring of patients suffering from various cancer types. FDG-PET images provide valuable insights by revealing regions where FDG, a glucose analog, accumulates within the body. While regions of high FDG uptake include suspicious tumor lesions, FDG also accumulates in non-tumor-specific regions and organs. Identifying these regions is crucial for excluding them from certain measurements, or calculating useful parameters, for example, the mean standardized uptake value (SUV) to assess the metabolic activity of the liver. Manual organ delineation from FDG-PET by clinicians demands significant effort and time, which is often not feasible in real clinical workflows with high patient loads. For this reason, this study focuses on automatically identifying key organs with high FDG uptake, namely the brain, left cardiac ventricle, kidneys, liver, and bladder. To this end, an ensemble approach is adopted, where a three-dimensional Attention U-Net (3D AU-Net) is employed for robust three-dimensional analysis, while a two-dimensional U-Net (2D U-Net) is utilized for analysis in the coronal plane. The 3D AU-Net demonstrates highly detailed organ segmentations, but also includes many false positive regions. In contrast, 2D U-Net achieves higher reliability with minimal false positive regions, but lacks the 3D details. Experiments conducted on a subset of the public AutoPET dataset with 60 PET scans demonstrate that the proposed ensemble model achieves high accuracy in segmenting the required organs, surpassing current state-of-the-art techniques, and supporting the potential utilization of the proposed methodology in accelerating and enhancing the clinical workflow of cancer patients. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
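The ensemble logic described above (detailed but false-positive-prone 3D masks, reliable but coarser 2D masks) can be sketched as a voxel-wise intersection: keep a 3D voxel only where the 2D model agrees. The exact fusion rule is an assumption; the abstract does not specify how the two outputs are combined:

```python
# Voxel-wise AND of two binary predictions (lists of 2D slices).

def ensemble(mask_3d, mask_2d):
    return [
        [[a & b for a, b in zip(row3, row2)]
         for row3, row2 in zip(sl3, sl2)]
        for sl3, sl2 in zip(mask_3d, mask_2d)
    ]

au3d = [[[1, 1], [1, 0]]]   # detailed, with a spurious voxel
u2d  = [[[1, 0], [1, 0]]]   # conservative, few false positives
print(ensemble(au3d, u2d))  # [[[1, 0], [1, 0]]]
```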
45. Deep Learning-Based Workflow for Bone Segmentation and 3D Modeling in Cone-Beam CT Orthopedic Imaging.
- Author
-
Tiribilli, Eleonora and Bocchi, Leonardo
- Subjects
CONVOLUTIONAL neural networks ,CONE beam computed tomography ,COMPUTED tomography ,GRAPH algorithms ,USER interfaces - Abstract
In this study, a deep learning-based workflow designed for the segmentation and 3D modeling of bones in cone beam computed tomography (CBCT) orthopedic imaging is presented. This workflow uses a convolutional neural network (CNN), specifically a U-Net architecture, to perform precise bone segmentation even in challenging anatomical regions such as limbs, joints, and extremities, where bone boundaries are less distinct and densities are highly variable. The effectiveness of the proposed workflow was evaluated by comparing the generated 3D models against those obtained through other segmentation methods, including SegNet, binary thresholding, and graph cut algorithms. The accuracy of these models was quantitatively assessed using the Jaccard index, the Dice coefficient, and the Hausdorff distance metrics. The results indicate that the U-Net-based segmentation consistently outperforms other techniques, producing more accurate and reliable 3D bone models. The user interface developed for this workflow facilitates intuitive visualization and manipulation of the 3D models, enhancing the usability and effectiveness of the segmentation process in both clinical and research settings. The findings suggest that the proposed deep learning-based workflow holds significant potential for improving the accuracy of bone segmentation and the quality of 3D models derived from CBCT scans, contributing to better diagnostic and pre-surgical planning outcomes in orthopedic practice. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
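Of the three metrics used above, the Hausdorff distance is the least familiar: it is the worst-case boundary mismatch between two point sets. A sketch on hypothetical 2D contour points; the same definition extends to 3D surface points from bone models:

```python
import math

# Symmetric Hausdorff distance: the larger of the two directed distances,
# where each directed distance is the farthest any point in one set lies
# from its nearest neighbor in the other set.

def hausdorff(a, b):
    def directed(src, dst):
        return max(
            min(math.dist(p, q) for q in dst)
            for p in src
        )
    return max(directed(a, b), directed(b, a))

contour_pred = [(0, 0), (1, 0), (2, 0)]
contour_true = [(0, 0), (1, 0), (2, 1)]
print(hausdorff(contour_pred, contour_true))  # 1.0
```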
46. Cell nuclei image segmentation using U-Net and DeepLabV3+ with transfer learning and regularization.
- Author
-
Koishiyeva, Dina, Sydybayeva, Madina, Belginova, Saule, Yeskendirova, Damelya, Azamatova, Zhanerke, Kalpebayev, Azamat, and Beketova, Gulzhanat
- Subjects
MACHINE learning ,COMPUTER vision ,CELL nuclei ,FEATURE extraction ,IMAGE segmentation - Abstract
Semantic nuclei segmentation is a challenging area of computer vision. Accurate, automatic nuclei segmentation can help clinicians diagnose many diseases, such as cancer, by providing automated tissue analysis. Deep learning algorithms allow automatic feature extraction from medical images; however, hematoxylin and eosin (H&E)-stained images are challenging due to variability in staining and texture. Using pre-trained models in deep learning speeds up development and improves performance. This paper compares the DeepLabV3+ and U-Net deep learning methods with the pre-trained models ResNet-50 and EfficientNetB4 embedded in their architectures. In addition, different regularization and dropout parameters are applied to prevent overfitting. The experiment was conducted on the PanNuke dataset, consisting of nearly 8,000 histological images with annotated nuclei. As a result, the ResNet50-based DeepLabV3+ model with L2 regularization of 0.02 and dropout of 0.7 proved most effective, with a Dice coefficient (DSC) of 0.8356, an intersection over union (IOU) of 0.7280, and a loss of 0.3212 on the test set. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
47. Change Detection for Forest Ecosystems Using Remote Sensing Images with Siamese Attention U-Net.
- Author
-
Hewarathna, Ashen Iranga, Hamlin, Luke, Charles, Joseph, Vigneshwaran, Palanisamy, George, Romiyal, Thuseethan, Selvarajah, Wimalasooriya, Chathrie, and Shanmugam, Bharanidharan
- Subjects
FOREST monitoring ,CARBON sequestration ,DEEP learning ,REMOTE sensing ,LAND cover ,LANDSCAPE assessment - Abstract
Forest ecosystems are critical components of Earth's biodiversity and play vital roles in climate regulation and carbon sequestration. They face increasing threats from deforestation, wildfires, and other anthropogenic activities. Timely detection and monitoring of changes in forest landscapes pose significant challenges for government agencies. To address these challenges, we propose a novel pipeline that refines the U-Net design, employing two different schemata of early fusion networks and a Siamese network architecture capable of processing RGB images, specifically designed to identify high-risk areas in forest ecosystems through change detection across different time frames at the same location. It predicts change maps, trained against annotated ground truth for these time frames, using an encoder–decoder approach with enhanced feature learning and an attention mechanism. Our proposed pipeline, integrated with ResNeSt blocks and SE attention techniques, achieved impressive results on our newly created forest cover change dataset. The evaluation metrics reveal a Dice score of 39.03%, a kappa score of 35.13%, an F1-score of 42.84%, and an overall accuracy of 94.37%. Notably, our approach significantly outperformed multitasking model approaches on the ONERA dataset, with a precision of 53.32%, a Dice score of 59.97%, and an overall accuracy of 97.82%. Furthermore, it surpassed multitasking models on the HRSCD dataset, even without utilizing land cover maps, achieving a Dice score of 44.62%, a kappa score of 11.97%, and an overall accuracy of 98.44%. Although the proposed model had a lower F1-score than other methods, the remaining performance metrics highlight its effectiveness in timely detection and forest landscape monitoring, advancing deep learning techniques in this field. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
48. Evaluating the Impact of Filtering Techniques on Deep Learning-Based Brain Tumour Segmentation.
- Author
-
Rosa, Sofia, Vasconcelos, Verónica, and Caridade, Pedro J. S. B.
- Subjects
CONTRAST-enhanced magnetic resonance imaging ,GREENHOUSE gases ,BRAIN tumors ,CONVOLUTIONAL neural networks ,SYMPTOMS - Abstract
Gliomas are a common and aggressive kind of brain tumour that is difficult to diagnose due to their infiltrative development, variable clinical presentation, and complex behaviour, making them an important focus in neuro-oncology. Segmentation of brain tumour images is critical for improving diagnosis, prognosis, and treatment options. Manually segmenting brain tumours is time-consuming and challenging. Automatic segmentation algorithms can significantly improve the accuracy and efficiency of tumour identification, thus improving treatment planning and outcomes. Deep learning-based tumour segmentation has shown significant advances in the last few years. This study evaluates the impact of four denoising filters, namely median, Gaussian, anisotropic diffusion, and bilateral, on tumour detection and segmentation. The U-Net architecture is applied for the segmentation of 3064 contrast-enhanced magnetic resonance images from 233 patients diagnosed with meningiomas, gliomas, and pituitary tumours. The results of this work demonstrate that bilateral filtering yields superior outcomes, proving to be a robust and computationally efficient approach in brain tumour segmentation. This method reduces the processing time by 12 epochs, which in turn contributes to lowering greenhouse gas emissions by optimizing computational resources and minimizing energy consumption. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
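To make the compared filters concrete, here is the simplest of the four evaluated above, a 3x3 median filter (the study found bilateral filtering best; median is shown purely for its simplicity). Border pixels are left unchanged in this toy version:

```python
import statistics

# 3x3 median filter on a grayscale image represented as a list of rows.
# Replaces each interior pixel with the median of its neighborhood,
# which suppresses isolated salt-and-pepper noise.

def median_filter(img):
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            window = [img[i + di][j + dj]
                      for di in (-1, 0, 1) for dj in (-1, 0, 1)]
            out[i][j] = statistics.median(window)
    return out

noisy = [[10, 10, 10],
         [10, 99, 10],   # salt noise in the center
         [10, 10, 10]]
print(median_filter(noisy)[1][1])  # 10 -- the outlier is suppressed
```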
49. Transfer of Periodic Phenomena in Multiphase Capillary Flows to a Quasi-Stationary Observation Using U-Net.
- Author
-
Oldach, Bastian, Wintermeyer, Philipp, and Kockmann, Norbert
- Subjects
THREE-dimensional imaging ,IMAGE analysis ,CAPILLARY flow ,ARTIFICIAL intelligence ,COMPUTER science - Abstract
Miniaturization promotes efficiency and expands the exploration domain in scientific fields such as computer science, engineering, medicine, and biotechnology. In particular, the field of microfluidics is a flourishing technology, which deals with the manipulation of small volumes of liquid. Droplets or bubbles dispersed in a second immiscible liquid are of great interest for screening applications and chemical and biochemical reactions. However, since very small dimensions are characterized by phenomena that differ from those at macroscopic scales, a deep understanding of the physics is crucial for effective device design. Due to the small volumes in miniaturized systems, common measurement techniques are not applicable, as their dimensions exceed those of the device many times over. Hence, image analysis is commonly chosen as a method to understand the ongoing phenomena. Artificial intelligence is now the state of the art for recognizing patterns in images and analyzing datasets too large for humans to handle. X-ray-based computed tomography adds a third dimension to images, which yields more information but ultimately also more complex image analysis. In this work, we present the application of the U-Net neural network to extract certain states during droplet formation in a capillary, which forms a constantly repeated process captured on tens of thousands of CT images. The experimental setup features a co-flow arrangement based on 3D-printed capillaries with two different cross-sections, with an inner diameter (circular) or edge length (square) of 1.6 mm. For droplet formation, water was dispersed in silicone oil. The classification into different droplet states allows for 3D reconstruction and a time-resolved 3D analysis of the observed phenomena. The original U-Net was modified to process input images of 688 × 432 pixels, while the encoder and decoder paths together feature 23 convolutional layers.
The U-Net consists of four max pooling layers and four upsampling layers. The training was performed on 90% and validated on 10% of a dataset containing 492 images showing different states of droplet formation. A mean Intersection over Union of 0.732 was achieved after training for 50 epochs, which is considered good performance. The presented U-Net needs 120 ms per image to process 60,000 images, categorizing emerging droplets into 24 states at 905 angles. Once the model is trained sufficiently, it provides accurate segmentation for various flow conditions. The selected images are used for 3D reconstruction, enabling 2D and 3D quantification of emerging droplets in capillaries with circular and square cross-sections. By applying this method, a temporal resolution of 25–40 ms was achieved. Droplets emerging in capillaries with a square cross-section become bigger under the same flow conditions than those in capillaries with a circular cross-section. The presented methodology is promising for other periodic phenomena in different scientific disciplines that rely on imaging techniques. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
50. Synthetic Knee MRI T1p Maps as an Avenue for Clinical Translation of Quantitative Osteoarthritis Biomarkers.
- Author
-
Tong, Michelle, Tolpadi, Aniket, Bhattacharjee, Rupsa, Han, Misung, Pedoia, Valentina, and Majumdar, Sharmila
- Subjects
CNN ,MRI ,T1p map ,T2 map ,U-Net ,deep learning ,generative AI ,knee ,osteoarthritis ,synthesis - Abstract
A 2D U-Net was trained to generate synthetic T1p maps from T2 maps for knee MRI to explore the feasibility of domain adaptation for enriching existing datasets and enabling rapid, reliable image reconstruction. The network was developed using 509 healthy contralateral and injured ipsilateral knee images from patients with ACL injuries and reconstruction surgeries, acquired across three institutions. Network generalizability was evaluated on 343 knees acquired in a clinical setting and 46 knees from simultaneous bilateral acquisition in a research setting. The deep neural network synthesized high-fidelity reconstructions of T1p maps, preserving textures and local T1p elevation patterns in cartilage, with a normalized mean square error of 2.4% and a Pearson's correlation coefficient of 0.93. Analysis of reconstructed T1p maps within cartilage compartments revealed minimal bias (-0.10 ms), tight limits of agreement, and a quantification error (5.7%) below the threshold for clinically significant change (6.42%) associated with osteoarthritis. In an out-of-distribution external test set, synthetic maps preserved T1p textures but exhibited increased bias and wider limits of agreement. This study demonstrates the capability of image synthesis to reduce acquisition time and derive meaningful information from existing datasets, and suggests a pathway for standardizing T1p as a quantitative biomarker for osteoarthritis.
- Published
- 2023