12 results on '"Vision Mamba"'
Search Results
2. Wind Field Reconstruction Method Using Incomplete Wind Data Based on Vision Mamba Decoder Network.
- Author
- Chen, Min, Wang, Haonan, Chen, Wantong, and Ren, Shiyu
- Subjects
AIRLINE routes, FLIGHT planning (Aeronautics), WIND speed, AERONAUTICAL safety measures, METEOROLOGY
- Abstract
Accurate meteorological information is crucial for the safety of civil aviation flights. Complete wind field information is particularly helpful for planning flight routes. To address the challenge of accurately reconstructing wind fields, this paper introduces a deep learning neural network method based on the Vision Mamba Decoder. The goal of the method is to reconstruct the original complete wind field from incomplete wind data distributed along air routes. This paper proposes improvements to the Vision Mamba model to fit our mission, showing that the developed model can accurately reconstruct the complete wind field. The experimental results demonstrate a mean absolute error (MAE) of wind speed of approximately 1.83 m/s, a mean relative error (MRE) of around 7.87%, an R-square value of about 0.92, and an MAE of wind direction of 5.78 degrees. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
3. A Novel End-to-End Deep Learning Framework for Chip Packaging Defect Detection.
- Author
- Zhou, Siyi, Yao, Shunhua, Shen, Tao, and Wang, Qingwang
- Subjects
- *FACTORY inspection, *IMAGE segmentation, *DATA mining, *X-ray imaging, *DEEP learning
- Abstract
As semiconductor chip manufacturing technology advances, chip structures are becoming more complex, leading to an increased likelihood of void defects in the solder layer during packaging. However, identifying void defects in packaged chips remains a significant challenge due to the complex chip background, varying defect sizes and shapes, and blurred boundaries between voids and their surroundings. To address these challenges, we present a deep-learning-based framework for void defect segmentation in chip packaging. The framework consists of two main components: a solder region extraction method and a void defect segmentation network. The solder region extraction method includes a lightweight segmentation network and a rotation correction algorithm that eliminates background noise and accurately captures the solder region of the chip. The void defect segmentation network is designed for efficient and accurate defect segmentation. To cope with the variability of void defect shapes and sizes, we propose a Mamba model-based encoder that uses a visual state space module for multi-scale information extraction. In addition, we propose an interactive dual-stream decoder that uses a feature correlation cross gate module to fuse the streams' features to improve their correlation and produce more accurate void defect segmentation maps. The effectiveness of the framework is evaluated through quantitative and qualitative experiments on our custom X-ray chip dataset. Furthermore, the proposed void defect segmentation framework for chip packaging has been applied to a real factory inspection line, achieving an accuracy of 93.3% in chip qualification. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
4. Visual State Space Model for Image Deraining with Symmetrical Scanning.
- Author
- Zhang, Yaoqing, He, Xin, Zhan, Chunxia, and Li, Junjie
- Subjects
- *TRANSFORMER models, *CONVOLUTIONAL neural networks, *IMAGE reconstruction, *VISION
- Abstract
Image deraining aims to mitigate the adverse effects of rain streaks on image quality. Recently, the advent of convolutional neural networks (CNNs) and Vision Transformers (ViTs) has catalyzed substantial advancements in this field. However, these methods fail to effectively balance model efficiency and image deraining performance. In this paper, we propose an effective, locally enhanced visual state space model for image deraining, called DerainMamba. Specifically, we introduce a global-aware state space model to better capture long-range dependencies with linear complexity. In contrast to existing methods that utilize fixed unidirectional scan mechanisms, we propose a direction-aware symmetrical scanning module to enhance the feature capture of rain streak direction. Furthermore, we integrate a local-aware mixture of experts into our framework to mitigate local pixel forgetting, thereby enhancing the overall quality of high-resolution image reconstruction. Experimental results validate that the proposed method surpasses state-of-the-art approaches on six benchmark datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
5. Bidirectional Copy–Paste Mamba for Enhanced Semi-Supervised Segmentation of Transvaginal Uterine Ultrasound Images.
- Author
- Peng, Boyuan, Liu, Yiyang, Wang, Wenwen, Zhou, Qin, Fang, Li, and Zhu, Xin
- Subjects
- *TRANSVAGINAL ultrasonography, *COMPUTER-aided diagnosis, *SUPERVISED learning, *ULTRASONIC imaging, *UTERINE diseases
- Abstract
Automated perimetrium segmentation of transvaginal ultrasound images is an important process for computer-aided diagnosis of uterine diseases. However, ultrasound images often contain various structures and textures, and these structures have different shapes, sizes, and contrasts; therefore, accurately segmenting the parametrium region of the uterus in transvaginal uterine ultrasound images is a challenge. Recently, many fully supervised deep learning-based methods have been proposed for the segmentation of transvaginal ultrasound images. Nevertheless, these methods require extensive pixel-level annotation by experienced sonographers. This procedure is expensive and time-consuming. In this paper, we present a bidirectional copy–paste Mamba (BCP-Mamba) semi-supervised model for segmenting the parametrium. The proposed model is based on a bidirectional copy–paste method and incorporates a U-shaped structure model with a visual state space (VSS) module instead of the traditional sampling method. A dataset comprising 1940 transvaginal ultrasound images from Tongji Hospital, Huazhong University of Science and Technology is utilized for training and evaluation. The proposed BCP-Mamba model undergoes comparative analysis with two widely recognized semi-supervised models, BCP-Net and U-Net, across various evaluation metrics including Dice, Jaccard, average surface distance (ASD), and Hausdorff_95. The results indicate the superior performance of the BCP-Mamba semi-supervised model, achieving a Dice coefficient of 86.55%, surpassing both U-Net (80.72%) and BCP-Net (84.63%) models. The Hausdorff_95 of the proposed method is 14.56. In comparison, the counterparts of U-Net and BCP-Net are 23.10 and 21.34, respectively. The experimental findings affirm the efficacy of the proposed semi-supervised learning approach in segmenting transvaginal uterine ultrasound images. 
The implementation of this model may alleviate the expert workload and facilitate more precise prediction and diagnosis of uterine-related conditions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
6. Efficient and Gender-Adaptive Graph Vision Mamba for Pediatric Bone Age Assessment
- Author
- Zhou, Lingyu, Yi, Zhang, Zhou, Kai, Xu, Xiuyuan, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Linguraru, Marius George, editor, Dou, Qi, editor, Feragen, Aasa, editor, Giannarou, Stamatia, editor, Glocker, Ben, editor, Lekadir, Karim, editor, and Schnabel, Julia A., editor
- Published
- 2024
- Full Text
- View/download PDF
7. Sky images based photovoltaic power forecasting: A novel approach with optimized VMD and Vision Mamba
- Author
- Chenhao Cai, Leyao Zhang, Jianguo Zhou, and Luming Zhou
- Subjects
Photovoltaic power forecasting, Vision Mamba, Variational mode decomposition, Snow ablation optimization, Technology
- Abstract
As the global demand for sustainable energy sources continues to grow, accurate prediction of photovoltaic power generation is crucial for optimizing the utilization of solar resources and enhancing the efficiency of photovoltaic systems. To improve the accuracy of photovoltaic power forecasting, this paper proposes a novel hybrid predictive model that integrates Optimized Variational Mode Decomposition (VMD), Vision Mamba (Vim) for extracting features from sky images, and advanced mechanisms like Patch Embedding and Variate-wise Cross-Attention. Initially, the proposed model employs SAO-optimized VMD to decompose the photovoltaic power series into high, medium, and low-frequency components. Subsequently, these components are patched to serve as input for the subsequent layers. In the third step, exogenous variables, including meteorological and image data, are introduced and processed through Variate Embedding combined with cross-attention mechanisms to capture the intricate interactions between these variables. Finally, by integrating the outputs from all processing steps through normalization and feed-forward layers, the final predictive results are produced. Experimental evaluations across different seasons demonstrate significant enhancements in forecasting accuracy, with the model achieving Root Mean Square Error (RMSE) values of 0.3587 in spring, 0.4376 in summer, 0.3544 in autumn, and 0.3493 in winter. Similarly, Mean Absolute Error (MAE) and Mean Squared Error (MSE) across these seasons underscore the model's effectiveness. This model offers new technical means for photovoltaic power forecasting and provides valuable decision support for the optimization and management of photovoltaic power systems.
- Published
- 2024
- Full Text
- View/download PDF
8. Wind Field Reconstruction Method Using Incomplete Wind Data Based on Vision Mamba Decoder Network
- Author
- Min Chen, Haonan Wang, Wantong Chen, and Shiyu Ren
- Subjects
wind nowcasting, state space models, Vision Mamba, meteorology, Motor vehicles. Aeronautics. Astronautics, TL1-4050
- Abstract
Accurate meteorological information is crucial for the safety of civil aviation flights. Complete wind field information is particularly helpful for planning flight routes. To address the challenge of accurately reconstructing wind fields, this paper introduces a deep learning neural network method based on the Vision Mamba Decoder. The goal of the method is to reconstruct the original complete wind field from incomplete wind data distributed along air routes. This paper proposes improvements to the Vision Mamba model to fit our mission, showing that the developed model can accurately reconstruct the complete wind field. The experimental results demonstrate a mean absolute error (MAE) of wind speed of approximately 1.83 m/s, a mean relative error (MRE) of around 7.87%, an R-square value of about 0.92, and an MAE of wind direction of 5.78 degrees.
- Published
- 2024
- Full Text
- View/download PDF
9. A Novel End-to-End Deep Learning Framework for Chip Packaging Defect Detection
- Author
- Siyi Zhou, Shunhua Yao, Tao Shen, and Qingwang Wang
- Subjects
chip packaging defect detection, Vision Mamba, dual-stream decoder, feature correlation, X-ray image segmentation, Chemical technology, TP1-1185
- Abstract
As semiconductor chip manufacturing technology advances, chip structures are becoming more complex, leading to an increased likelihood of void defects in the solder layer during packaging. However, identifying void defects in packaged chips remains a significant challenge due to the complex chip background, varying defect sizes and shapes, and blurred boundaries between voids and their surroundings. To address these challenges, we present a deep-learning-based framework for void defect segmentation in chip packaging. The framework consists of two main components: a solder region extraction method and a void defect segmentation network. The solder region extraction method includes a lightweight segmentation network and a rotation correction algorithm that eliminates background noise and accurately captures the solder region of the chip. The void defect segmentation network is designed for efficient and accurate defect segmentation. To cope with the variability of void defect shapes and sizes, we propose a Mamba model-based encoder that uses a visual state space module for multi-scale information extraction. In addition, we propose an interactive dual-stream decoder that uses a feature correlation cross gate module to fuse the streams’ features to improve their correlation and produce more accurate void defect segmentation maps. The effectiveness of the framework is evaluated through quantitative and qualitative experiments on our custom X-ray chip dataset. Furthermore, the proposed void defect segmentation framework for chip packaging has been applied to a real factory inspection line, achieving an accuracy of 93.3% in chip qualification.
- Published
- 2024
- Full Text
- View/download PDF
10. Visual State Space Model for Image Deraining with Symmetrical Scanning
- Author
- Yaoqing Zhang, Xin He, Chunxia Zhan, and Junjie Li
- Subjects
image deraining, rain removal, visual state space, vision Mamba, Mathematics, QA1-939
- Abstract
Image deraining aims to mitigate the adverse effects of rain streaks on image quality. Recently, the advent of convolutional neural networks (CNNs) and Vision Transformers (ViTs) has catalyzed substantial advancements in this field. However, these methods fail to effectively balance model efficiency and image deraining performance. In this paper, we propose an effective, locally enhanced visual state space model for image deraining, called DerainMamba. Specifically, we introduce a global-aware state space model to better capture long-range dependencies with linear complexity. In contrast to existing methods that utilize fixed unidirectional scan mechanisms, we propose a direction-aware symmetrical scanning module to enhance the feature capture of rain streak direction. Furthermore, we integrate a local-aware mixture of experts into our framework to mitigate local pixel forgetting, thereby enhancing the overall quality of high-resolution image reconstruction. Experimental results validate that the proposed method surpasses state-of-the-art approaches on six benchmark datasets.
- Published
- 2024
- Full Text
- View/download PDF
11. Bidirectional Copy–Paste Mamba for Enhanced Semi-Supervised Segmentation of Transvaginal Uterine Ultrasound Images
- Author
- Boyuan Peng, Yiyang Liu, Wenwen Wang, Qin Zhou, Li Fang, and Xin Zhu
- Subjects
semi-supervised learning, transvaginal ultrasound, uterus perimetrium, Vision Mamba, Medicine (General), R5-920
- Abstract
Automated perimetrium segmentation of transvaginal ultrasound images is an important process for computer-aided diagnosis of uterine diseases. However, ultrasound images often contain various structures and textures, and these structures have different shapes, sizes, and contrasts; therefore, accurately segmenting the parametrium region of the uterus in transvaginal uterine ultrasound images is a challenge. Recently, many fully supervised deep learning-based methods have been proposed for the segmentation of transvaginal ultrasound images. Nevertheless, these methods require extensive pixel-level annotation by experienced sonographers. This procedure is expensive and time-consuming. In this paper, we present a bidirectional copy–paste Mamba (BCP-Mamba) semi-supervised model for segmenting the parametrium. The proposed model is based on a bidirectional copy–paste method and incorporates a U-shaped structure model with a visual state space (VSS) module instead of the traditional sampling method. A dataset comprising 1940 transvaginal ultrasound images from Tongji Hospital, Huazhong University of Science and Technology is utilized for training and evaluation. The proposed BCP-Mamba model undergoes comparative analysis with two widely recognized semi-supervised models, BCP-Net and U-Net, across various evaluation metrics including Dice, Jaccard, average surface distance (ASD), and Hausdorff_95. The results indicate the superior performance of the BCP-Mamba semi-supervised model, achieving a Dice coefficient of 86.55%, surpassing both U-Net (80.72%) and BCP-Net (84.63%) models. The Hausdorff_95 of the proposed method is 14.56. In comparison, the counterparts of U-Net and BCP-Net are 23.10 and 21.34, respectively. The experimental findings affirm the efficacy of the proposed semi-supervised learning approach in segmenting transvaginal uterine ultrasound images. 
The implementation of this model may alleviate the expert workload and facilitate more precise prediction and diagnosis of uterine-related conditions.
- Published
- 2024
- Full Text
- View/download PDF
12. Enhancing pixel-level crack segmentation with visual mamba and convolutional networks.
- Author
- Han, Chengjia, Yang, Handuo, and Yang, Yaowen
- Subjects
- *CONVOLUTIONAL neural networks, *CRACKING of pavements, *ARTIFICIAL intelligence, *SAMPLE size (Statistics)
- Abstract
Computer vision-based semantic segmentation methods are currently the most widely used for automated detection of structural cracks in buildings and pavements. However, these methods face persistent challenges in detecting fine cracks with small widths and in distinguishing cracks from background stains. This paper addresses these issues by introducing MambaCrackNet, a new network architecture for pixel-level crack segmentation. MambaCrackNet incorporates residual visual Mamba blocks and integrates visual Mamba and convolutional neural network-based segmentation techniques. This approach effectively enhances the detection of fine cracks, reduces misdetections of background stains, and remains robust to variations in patch size and training sample sizes, making it highly practical for engineering applications. On two open access crack datasets, MambaCrackNet outperformed mainstream crack segmentation models, achieving MIoU scores of 0.8939 and 0.8560 and F1-scores of 0.8817 and 0.8412.
• A crack segmentation model integrating the CNN-based and ViT-based techniques is proposed.
• MambaCrackNet reduces the false detection of background stains that are similar to cracks.
• MambaCrackNet improves the detection of complex and fine cracks.
• On the two open access datasets, MambaCrackNet achieves SOTA with 0.8817 and 0.8412 MIoU.
[ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
Catalog
Discovery Service for Jio Institute Digital Library