2,563 results for "Instance segmentation"
Search Results
2. Cross-Task Data Augmentation by Pseudo-Label Generation for Region Based Coronary Artery Instance Segmentation
- Author
-
Pokhrel, Sandesh, Bhandari, Sanjay, Vazquez, Eduard, Shrestha, Yash Raj, Bhattarai, Binod, Bhattarai, Binod, editor, Ali, Sharib, editor, Rau, Anita, editor, Caramalau, Razvan, editor, Nguyen, Anh, editor, Gyawali, Prashnna, editor, Namburete, Ana, editor, and Stoyanov, Danail, editor
- Published
- 2025
- Full Text
- View/download PDF
3. Learning Instance-Discriminative Pixel Embeddings Using Pixel Triplets
- Author
-
Chen, Long, Merhof, Dorit, Xu, Xuanang, editor, Cui, Zhiming, editor, Rekik, Islem, editor, Ouyang, Xi, editor, and Sun, Kaicong, editor
- Published
- 2025
- Full Text
- View/download PDF
4. MarineInst: A Foundation Model for Marine Image Analysis with Instance Visual Description
- Author
-
Zheng, Ziqiang, Chen, Yiwei, Zeng, Huimin, Vu, Tuan-Anh, Hua, Binh-Son, Yeung, Sai-Kit, Leonardis, Aleš, editor, Ricci, Elisa, editor, Roth, Stefan, editor, Russakovsky, Olga, editor, Sattler, Torsten, editor, and Varol, Gül, editor
- Published
- 2025
- Full Text
- View/download PDF
5. Instance-Based CycleGAN for Object Segmentation with Few Annotations
- Author
-
Díaz Estrada, David N., Robert, Olivier, Kresovic, Milan, Torres, Cindy, Muselet, Damien, Tremeau, Alain, Schettini, Raimondo, editor, Trémeau, Alain, editor, Tominaga, Shoji, editor, Bianco, Simone, editor, and Buzzelli, Marco, editor
- Published
- 2025
- Full Text
- View/download PDF
6. Deep Learning-Based Instance Segmentation of Neural Progenitor Cell Nuclei in Fluorescence Microscopy Images
- Author
-
Pérez, Gabriel, Russo, Claudia Cecilia, Palumbo, Maria Laura, Moroni, Alejandro David, Naiouf, Marcelo, editor, De Giusti, Laura, editor, Chichizola, Franco, editor, and Libutti, Leandro, editor
- Published
- 2025
- Full Text
- View/download PDF
7. YOLOv8E: an efficient YOLOv8 method for instance segmentation of individual tree crowns in Wellington City, New Zealand.
- Author
-
Sun, Ziyi, Xue, Bing, Zhang, Mengjie, and Schindler, Jan
- Abstract
Instance segmentation is crucial for analysing individual tree crowns in aerial imagery, which plays an important role in forest management, risk modelling, biodiversity modelling, forest health and human wellbeing studies. Traditional instance segmentation methods struggle in diverse rural landscapes where canopy images feature primarily small and medium tree objects, varying from isolated trees to dense forest stands. This paper introduces YOLOv8E, a new and efficient YOLOv8 method, optimised for precise instance segmentation and species classification of tree crowns. This method includes new schemes for selecting candidate positive samples for each instance and a refined network design tailored for small and medium-sized tree crowns. Adjustments in hyperparameters, particularly within the Task-Aligned Assigner, are also discussed to better suit canopy segmentation tasks. Comprehensive experiments conducted on the datasets for Wellington City, Aotearoa New Zealand, demonstrate that YOLOv8E outperforms a number of recent methods, achieving 36.1 and 32.2 in terms of the Box AP and Mask AP metrics, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
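The Box AP and Mask AP figures quoted above follow the standard COCO-style evaluation: detections are matched to ground truths by intersection-over-union (IoU) and precision is averaged over the ranked detections. A minimal sketch of box IoU and single-threshold average precision (illustrative only, not the paper's evaluation code):

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def average_precision(detections, gts, thr=0.5):
    """AP at a single IoU threshold; detections are (score, box), gts are boxes."""
    detections = sorted(detections, key=lambda d: -d[0])  # rank by confidence
    matched, hits = set(), []
    for score, box in detections:
        best, best_j = 0.0, -1
        for j, g in enumerate(gts):
            if j not in matched and iou(box, g) > best:
                best, best_j = iou(box, g), j
        if best >= thr:
            matched.add(best_j)   # each ground truth matches at most once
            hits.append(1)
        else:
            hits.append(0)
    ap, tp = 0.0, 0
    for i, h in enumerate(hits):  # sum precision at each true-positive rank
        if h:
            tp += 1
            ap += tp / (i + 1)
    return ap / len(gts) if gts else 0.0
```

COCO's headline AP additionally averages this quantity over IoU thresholds from 0.5 to 0.95, and Mask AP replaces box IoU with mask IoU.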
8. Instance segmentation models for detecting floating macroplastic debris from river surface images.
- Author
-
Kataoka, Tomoya, Yoshida, Takushi, and Yamamoto, Natsuki
- Subjects
-
IMAGE segmentation, WATER levels, CAMERAS, ULTRASONICS, ALGORITHMS
- Abstract
Quantifying the transport of floating macroplastic debris (FMPD) in waterways is essential for understanding plastic emissions from land. However, no robust tool has been developed to monitor FMPD. Here, to detect FMPD on river surfaces, we developed five instance segmentation models based on the state-of-the-art You Only Look Once (YOLOv8) architecture using 7,356 training images collected via fixed-camera monitoring of seven rivers. Our models could detect FMPD using object detection and image segmentation approaches with accuracies similar to those of the pretrained YOLOv8 model. Our model performances were tested using 3,802 images generated from 107 frames obtained by a novel camera system embedded in an ultrasonic water level gauge (WLGCAM) installed in three rivers. Interestingly, the model with intermediate weight parameters detected FMPD most accurately, whereas the model with the most parameters exhibited poor performance due to overfitting. Additionally, we assessed the dependence of the detection performance on the ground sampling distance (GSD) and found that a smaller GSD for the image segmentation approach and a larger GSD for the object detection approach are capable of accurately detecting FMPD. Based on these results, more appropriate category selections need to be determined to improve the model performance and reduce the number of false positives. Our study can aid in the development of guidelines for monitoring FMPD and the establishment of an algorithm for quantifying the transport of FMPD. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
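The ground sampling distance (GSD) the authors vary is a purely geometric quantity; under the usual pinhole-camera, nadir-view assumption (the abstract does not give the authors' exact formula) it can be sketched as:

```python
def ground_sampling_distance(sensor_width_mm, image_width_px,
                             focal_length_mm, height_m):
    """Metres of ground (or water surface) covered by one pixel.

    Pinhole-camera approximation for a nadir-looking camera: the
    pixel pitch scaled by the ratio of camera height to focal length.
    """
    pixel_pitch_mm = sensor_width_mm / image_width_px
    return pixel_pitch_mm * height_m / focal_length_mm

# e.g. a 36 mm-wide sensor, 4000 px across, 50 mm lens, 10 m above the river
gsd = ground_sampling_distance(36.0, 4000, 50.0, 10.0)
```

A smaller GSD means finer surface detail per pixel, which matches the finding that the segmentation approach benefits from lower mounting heights or longer focal lengths.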
9. AI-Assisted Detection and Localization of Spinal Metastatic Lesions.
- Author
-
Edelmers, Edgars, Ņikuļins, Artūrs, Sprūdža, Klinta Luīze, Stapulone, Patrīcija, Pūce, Niks Saimons, Skrebele, Elizabete, Siņicina, Everita Elīna, Cīrule, Viktorija, Kazuša, Ance, and Boločko, Katrina
- Subjects
-
BONE metastasis, COMPUTER-assisted image analysis (Medicine), COMPUTED tomography, ARTIFICIAL intelligence, RADIOMICS
- Abstract
Objectives: The integration of machine learning and radiomics in medical imaging has significantly advanced diagnostic and prognostic capabilities in healthcare. This study focuses on developing and validating an artificial intelligence (AI) model using U-Net architectures for the accurate detection and segmentation of spinal metastases from computed tomography (CT) images, addressing both osteolytic and osteoblastic lesions. Methods: Our methodology employs multiple variations of the U-Net architecture and utilizes two distinct datasets: one consisting of 115 polytrauma patients for vertebra segmentation and another comprising 38 patients with documented spinal metastases for lesion detection. Results: The model demonstrated strong performance in vertebra segmentation, achieving Dice Similarity Coefficient (DSC) values between 0.87 and 0.96. For metastasis segmentation, the model achieved a DSC of 0.71 and an F-beta score of 0.68 for lytic lesions but struggled with sclerotic lesions, obtaining a DSC of 0.61 and an F-beta score of 0.57, reflecting challenges in detecting dense, subtle bone alterations. Despite these limitations, the model successfully identified isolated metastatic lesions beyond the spine, such as in the sternum, indicating potential for broader skeletal metastasis detection. Conclusions: The study concludes that AI-based models can augment radiologists' capabilities by providing reliable second-opinion tools, though further refinements and diverse training data are needed for optimal performance, particularly for sclerotic lesion segmentation. The annotated CT dataset produced and shared in this research serves as a valuable resource for future advancements. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
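The Dice Similarity Coefficient and F-beta score reported above have simple closed forms; a generic sketch over binary masks and confusion counts (not the authors' implementation):

```python
def dice(pred, truth):
    """Dice similarity coefficient between two sets of voxel indices."""
    inter = len(pred & truth)
    return 2 * inter / (len(pred) + len(truth)) if (pred or truth) else 1.0

def f_beta(tp, fp, fn, beta=1.0):
    """F-beta score from confusion counts; beta > 1 favours recall."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0
```

With beta = 1 this reduces to the familiar F1 score; an F-beta with beta > 1 weights recall more heavily, which is useful when missing a lesion is costlier than a false alarm.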
10. A Review of Semantic Segmentation and Instance Segmentation Techniques in Forestry Using LiDAR and Imagery Data.
- Author
-
Wołk, Krzysztof and Tatara, Marek S.
- Subjects
-
FOREST surveys, ENVIRONMENTAL monitoring, FORESTS & forestry, DATA quality, LIDAR
- Abstract
The objective of this review is to conduct a critical analysis of the current literature pertaining to segmentation techniques and provide a methodical summary of their impact on forestry-related activities, emphasizing their applications using LiDAR and imagery data. This review covers the challenges, progress, and application of these strategies in ecological monitoring, forest inventory, and tree species classification. Through the process of synthesizing pivotal discoveries from multiple studies, this comprehensive analysis provides valuable perspectives on the present status of research and highlights prospective areas for further exploration. The primary topics addressed encompass the approach employed for executing the examination, the fundamental discoveries associated with semantic segmentation and instance segmentation in the domain of forestry, and the ramifications of these discoveries for the discipline. This review highlights the effectiveness of semantic and instance segmentation techniques in forestry applications, such as precise tree species identification and individual tree monitoring. However, challenges such as occlusions, overlapping branches, and varying data quality remain. Future research should focus on overcoming these obstacles to enhance the precision and applicability of these segmentation methodologies. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
11. Object Detection on Real-Time Video with FPN and Modified Mask RCNN Based on Inception-ResNetV2.
- Author
-
Yadav, Anu and Kumar, Ela
- Subjects
-
OBJECT recognition (Computer vision), FEATURE extraction, LEARNING ability, RESEARCH personnel, PROBLEM solving, DEEP learning
- Abstract
Instance segmentation of real-time video is a crucial step in the object identification and classification process. Object detection is the task of finding information about an object in a video by masking it and drawing a bounding box around its position in the image. Deep learning has advanced the field of object identification through its excellent feature-learning ability. Numerous researchers have employed various deep-learning methods to perform object detection with the goal of improving the precision of feature extraction. Because of poor extraction of features from the video frame, the higher- and lower-level features of an object are not extracted properly. Hence, the Feature Pyramid Network (FPN)-integrated Modified Mask RCNN based on Inception-ResNetV2 is employed to extract the higher- and lower-level features from the video to solve this problem. In the designed model, the video dataset is converted to frames, and the lower- and higher-level features of each frame are extracted using the FPN and the backbone (Inception-ResNetV2). Candidate regions for object detection are selected automatically by the Region Proposal Network, and the selected regions are aligned using Region of Interest alignment. From the aligned image, a fully convolutional layer performs bounding-box regression and class detection, and further convolutional layers mask the detected object. In evaluating object detection on real-time video using the Modified Mask RCNN, the Accuracy, Precision, and Recall attained by the proposed model on the COCO dataset are 0.98, 0.93, and 0.94, respectively, better than existing approaches including RCNN, SWINV2-G, Mask RCNN, SWINV2-L, and Fast RCNN. As a result, the developed model accurately and rapidly differentiates objects in real-time video. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
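The Accuracy, Precision, and Recall values cited (0.98, 0.93, 0.94) are standard confusion-matrix metrics; a minimal reference implementation:

```python
def detection_metrics(tp, fp, fn, tn=0):
    """Accuracy, precision and recall from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return {"precision": precision, "recall": recall, "accuracy": accuracy}

# e.g. 90 true positives, 10 false positives, 5 misses, 95 true negatives
m = detection_metrics(90, 10, 5, 95)
```

Note that accuracy requires a true-negative count, which is ill-defined for pure detection tasks; that is why detection papers usually prefer precision/recall and AP.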
12. Development of Segmentation Technology for Fall Risk Areas in Small-Scale Construction Sites Based on Bird's-eye-view Images.
- Author
-
Na Jong-ho, Lee Jae-kang, Shin Hyu-soung, and Yun Il-dong
- Subjects
-
BUILDING sites, ARTIFICIAL intelligence, CLASSIFICATION, SAFETY
- Abstract
Construction sites have shown the highest incidence of safety accidents across industries in recent times. Small-scale sites, in particular, often operate without on-site safety managers, leading to significant safety oversights. In this study, we developed a method of identifying risk areas during construction procedures by using bird's-eye-view image data throughout the construction cycle. Actual construction site images were collected and specific target objects were selected to create an AI training dataset. The segmentation model's performance was validated, and a system was developed to identify fall risk areas by establishing interconnections between these target objects within the images. The findings of this study can help enhance compliance assessment with construction procedures and improve safety management oversight at small-scale construction sites. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
13. SMLS-YOLO: an extremely lightweight pathological myopia instance segmentation method.
- Author
-
Hanfei Xie, Baoxi Yuan, Chengyu Hu, Yujie Gao, Feng Wang, Yuqian Wang, Chunlan Wang, and Peng Chu
- Subjects
-
FEATURE extraction, VISION disorders, PHYSICIANS, BLINDNESS, EXPERTISE
- Abstract
Pathological myopia is a major cause of blindness among people under 50 years old and can result in severe vision loss in extreme cases. Currently, its detection primarily relies on manual methods, which are slow and heavily dependent on the expertise of physicians, making them impractical for large-scale screening. To tackle these challenges, we propose SMLS-YOLO, an instance segmentation method based on YOLOv8n-seg. Designed for efficiency in large-scale screenings, SMLS-YOLO employs an extremely lightweight model. First, StarNet is introduced as the backbone of SMLS-YOLO to extract image features. Subsequently, the StarBlock from StarNet is utilized to enhance the C2f, resulting in the creation of the C2f-Star feature extraction module. Furthermore, shared convolution and scale reduction strategies are employed to optimize the segmentation head for a more lightweight design. Lastly, the model incorporates the Multi-Head Self-Attention (MHSA) mechanism following the backbone to further refine the feature extraction process. Experimental results on the pathological myopia dataset demonstrate that SMLS-YOLO outperforms the baseline YOLOv8n-seg by reducing model parameters by 46.9%, increasing Box mAP@0.5 by 2.4%, and enhancing Mask mAP@0.5 by 4%. Furthermore, when compared to other advanced instance segmentation and semantic segmentation algorithms, SMLS-YOLO also maintains a leading position, suggesting that SMLS-YOLO has promising applications in the segmentation of pathological myopia images. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
14. A dual-labeled dataset and fusion model for automatic teeth segmentation, numbering, and state assessment on panoramic radiographs.
- Author
-
Zhou, Wenbo, Lu, Xin, Zhao, Dan, Jiang, Meng, Fan, Linlin, Zhang, Weihang, Li, Fenglin, Wang, Dezhou, Yin, Weihuang, and Liu, Xin
- Subjects
-
DENTAL radiography, STATISTICAL models, DENTAL care, MEDICAL informatics, RESEARCH funding, DENTAL casting, DESCRIPTIVE statistics, DEEP learning, PANORAMIC radiography, AUTOMATION, DIGITAL image processing
- Abstract
Background: Recently, deep learning has been increasingly applied in the field of dentistry. The aim of this study is to develop a model for the automatic segmentation, numbering, and state assessment of teeth on panoramic radiographs. Methods: We created a dual-labeled dataset on panoramic radiographs for training, incorporating both numbering and state labels. We then developed a fusion model that combines a YOLOv9-e instance segmentation model with an EfficientNetv2-l classification model. The instance segmentation model is used for tooth segmentation and numbering, whereas the classification model is used for state evaluation. The final prediction results integrate tooth position, numbering, and state information. The model's output includes result visualization and automatic report generation. Results: Precision, Recall, mAP50 (mean Average Precision), and mAP50-95 for the tooth instance segmentation task are 0.989, 0.955, 0.975, and 0.840, respectively. Precision, Recall, Specificity, and F1 Score for the tooth classification task are 0.943, 0.933, 0.985, and 0.936, respectively. Conclusions: This fusion model is the first to integrate automatic dental segmentation, numbering, and state assessment. It provides highly accurate results, including detailed visualizations and automated report generation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
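The fusion step described here — attaching the classifier's state label to each segmented and numbered tooth before report generation — can be sketched with a hypothetical data layout (the abstract does not specify the model's actual interface):

```python
def fuse_results(instances, states):
    """Merge segmentation/numbering output with per-tooth state labels.

    instances: list of dicts with 'number' (FDI notation) and 'box';
    states: dict mapping tooth number -> predicted state.
    """
    report = []
    for inst in instances:
        state = states.get(inst["number"], "unlabelled")
        report.append(f"Tooth {inst['number']} at {inst['box']}: {state}")
    return report

lines = fuse_results(
    [{"number": 11, "box": (40, 60, 80, 120)},
     {"number": 36, "box": (200, 64, 240, 130)}],
    {11: "caries", 36: "filling"},
)
```

In practice the state would likely be predicted per cropped tooth image rather than keyed by number, but the principle — one record per instance, enriched by the classifier — is the same.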
15. A Deep Learning Biomimetic Milky Way Compass.
- Author
-
Tao, Yiting, Lucas, Michael, Perera, Asanka, Teague, Samuel, McIntyre, Timothy, Ogunwa, Titilayo, Warrant, Eric, and Chahl, Javaan
- Subjects
-
MILKY Way, DEEP learning, FIELD research, INSECTS, ALGORITHMS
- Abstract
Moving in straight lines is a behaviour that enables organisms to search for food, move away from threats, and ultimately seek suitable environments in which to survive and reproduce. This study explores a vision-based technique for detecting a change in heading direction using the Milky Way (MW), one of the navigational cues that are known to be used by night-active insects. An algorithm is proposed that combines the YOLOv8m-seg model and normalised second central moments to calculate the MW orientation angle. This method addresses many likely scenarios where segmentation of the MW from the background by image thresholding or edge detection is not applicable, such as when the moon is substantial or when anthropogenic light is present. The proposed YOLOv8m-seg model achieves a segment mAP@0.5 of 84.7% on the validation dataset using our own training dataset of MW images. To explore its potential role in autonomous system applications, we compare night sky imagery and GPS heading data from a field trial in rural South Australia. The comparison results show that for short-term navigation, the segmented MW image can be used as a reliable orientation cue. There is a difference of roughly 5–10° between the proposed method and the GPS ground truth (GT), as the path involves left or right 90° turns at certain locations. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
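The normalised-second-central-moments step has a closed form: the blob orientation is θ = ½·atan2(2μ11, μ20 − μ02). A sketch over a binary mask given as (row, col) pixel coordinates (illustrative; the paper's exact normalisation may differ):

```python
import math

def orientation_angle(pixels):
    """Orientation (radians) of a pixel blob via second central moments."""
    n = len(pixels)
    cy = sum(p[0] for p in pixels) / n          # centroid row
    cx = sum(p[1] for p in pixels) / n          # centroid column
    mu20 = sum((p[1] - cx) ** 2 for p in pixels) / n
    mu02 = sum((p[0] - cy) ** 2 for p in pixels) / n
    mu11 = sum((p[1] - cx) * (p[0] - cy) for p in pixels) / n
    return 0.5 * math.atan2(2 * mu11, mu20 - mu02)

# a thin diagonal streak of pixels has orientation pi/4 in this convention
streak = [(i, i) for i in range(10)]
angle = orientation_angle(streak)
```

Because the moments summarise the whole segmented region rather than its boundary, the estimate degrades gracefully when the mask edges are noisy, which is presumably why it pairs well with a learned segmentation of the MW band.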
16. Defective Pennywort Leaf Detection Using Machine Vision and Mask R-CNN Model.
- Author
-
Chowdhury, Milon, Reza, Md Nasim, Jin, Hongbin, Islam, Sumaiya, Lee, Geung-Joo, and Chung, Sun-Ok
- Subjects
-
COMPUTER vision, RECOGNITION (Psychology), MARKET value, FERTIGATION, FARMERS
- Abstract
Demand and market value for pennywort largely depend on the quality of the leaves, which can be affected by various ambient environment or fertigation variables during cultivation. Although early detection of defects in pennywort leaves would enable growers to take quick action, conventional manual detection is laborious and time consuming as well as subjective. Therefore, the objective of this study was to develop an automatic leaf defect detection algorithm for pennywort plants grown under controlled environment conditions, using machine vision and deep learning techniques. Leaf images were captured from pennywort plants grown in an ebb-and-flow hydroponic system under fluorescent light conditions in a controlled plant factory environment. Physically or biologically damaged leaves (e.g., curled, creased, discolored, misshapen, or brown spotted) were classified as defective leaves. Images were annotated using an online tool, and Mask R-CNN models were implemented with the integrated attention mechanisms, convolutional block attention module (CBAM) and coordinate attention (CA) and compared for improved image feature extraction. Transfer learning was employed to train the model with a smaller dataset, effectively reducing processing time. The improved models demonstrated significant advancements in accuracy and precision, with the CA-augmented model achieving the highest metrics, including a mean average precision (mAP) of 0.931 and an accuracy of 0.937. These enhancements enabled more precise localization and classification of leaf defects, outperforming the baseline Mask R-CNN model in complex visual recognition tasks. The final model was robust, effectively distinguishing defective leaves in challenging scenarios, making it highly suitable for applications in precision agriculture. 
Future research can build on this modeling framework, exploring additional variables to identify specific leaf abnormalities at earlier growth stages, which is crucial for production quality assurance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
17. High-Precision Instance Segmentation Detection of Micrometer-Scale Primary Carbonitrides in Nickel-Based Superalloys for Industrial Applications.
- Author
-
Zhang, Jie, Zheng, Haibin, Zeng, Chengwei, and Gu, Changlong
- Subjects
-
HEAT resistant alloys, DEEP learning, ELECTRONIC data processing, ALLOYS, HOMOGENEITY
- Abstract
In industrial production, the identification and characterization of micron-sized second phases, such as carbonitrides in alloys, hold significant importance for optimizing alloy compositions and processes. However, conventional methods based on threshold segmentation suffer from drawbacks, including low accuracy, inefficiency, and subjectivity. Addressing these limitations, this study introduced a carbonitride instance segmentation model tailored for various nickel-based superalloys. The model enhanced the YOLOv8n network structure by integrating the SPDConv module and the P2 small target detection layer, thereby augmenting feature fusion capability and small target detection performance. Experimental findings demonstrated notable improvements: the mAP50 (Box) value increased from 0.676 to 0.828, and the mAP50 (Mask) value from 0.471 to 0.644 for the enhanced YOLOv8n model. The proposed model for carbonitride detection surpassed traditional threshold segmentation methods, meeting requirements for precise, rapid, and batch-automated detection in industrial settings. Furthermore, to assess the carbonitride distribution homogeneity, a method for quantifying dispersion uniformity was proposed and integrated into a data processing framework for seamless automation from prediction to analysis. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
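The abstract does not specify the proposed dispersion-uniformity measure; one common stand-in is the coefficient of variation of nearest-neighbour distances between detected particles (a hypothetical choice here, lower = more uniform):

```python
import math

def nn_distance_cv(points):
    """Coefficient of variation of nearest-neighbour distances.

    Lower values indicate a more uniform particle dispersion;
    a perfectly regular grid scores 0.
    """
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])
    nn = [min(dist(p, q) for q in points if q != p) for p in points]
    mean = sum(nn) / len(nn)
    var = sum((d - mean) ** 2 for d in nn) / len(nn)
    return math.sqrt(var) / mean

grid = [(x, y) for x in range(4) for y in range(4)]  # regular 4x4 grid
```

Feeding such a statistic the centroids of the segmented carbonitrides would turn per-image masks into a single uniformity number, matching the paper's prediction-to-analysis pipeline.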
18. CHiMP: deep‐learning tools trained on protein crystallization micrographs to enable automation of experiments.
- Author
-
King, Oliver N. F., Levik, Karl E., Sandy, James, and Basham, Mark
- Subjects
-
OBJECT recognition (Computer vision), IMAGE recognition (Computer vision), LIGHT sources, DEEP learning, PROTEIN analysis
- Abstract
A group of three deep-learning tools, referred to collectively as CHiMP (Crystal Hits in My Plate), were created for analysis of micrographs of protein crystallization experiments at the Diamond Light Source (DLS) synchrotron, UK. The first tool, a classification network, assigns images into categories relating to experimental outcomes. The other two tools are networks that perform both object detection and instance segmentation, resulting in masks of individual crystals in the first case and masks of crystallization droplets in addition to crystals in the second case, allowing the positions and sizes of these entities to be recorded. The creation of these tools used transfer learning, where weights from a pre-trained deep-learning network were used as a starting point and repurposed by further training on a relatively small set of data. Two of the tools are now integrated at the VMXi macromolecular crystallography beamline at DLS, where they have the potential to obviate the need for any user input, both for monitoring crystallization experiments and for triggering in situ data collections. The third is being integrated into the XChem fragment-based drug-discovery screening platform, also at DLS, to allow the automatic targeting of acoustic compound dispensing into crystallization droplets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
19. Semi-supervised multi-class tree crown delineation using aerial multispectral imagery and lidar data.
- Author
-
Dersch, S., Schöttl, A., Krzystek, P., and Heurich, M.
- Subjects
-
SUPERVISED learning, EUROPEAN beech, SILVER fir, MIXED forests, DECIDUOUS forests
- Abstract
The segmentation of individual trees based on deep learning is more accurate than conventional methods. However, a sufficient amount of training data is mandatory to leverage the accuracy potential of deep learning-based approaches. Semi-supervised learning techniques, by contrast, can help simplify the time-consuming labelling process. In this study, we introduce a new semi-supervised tree segmentation approach for the precise delineation and classification of individual trees that takes advantage of pre-clustered tree training labels. Specifically, the instance segmentation Mask R-CNN is combined with the normalized cut clustering method, which is applied to lidar point clouds. The study areas were located in the Bavarian Forest National Park, southeast Germany, where the tree composition includes coniferous, deciduous and mixed forest. Important tree species are European beech (Fagus sylvatica), Norway spruce (Picea abies) and silver fir (Abies alba). Multispectral image data with a ground sample distance of 10 cm and laser scanning data with a point density of approximately 55 points/m² were acquired in June 2017. From the laser scanning data, three-channel images with a resolution of 10 cm were generated. The models were tested in seven reference plots in the national park, with a total of 516 trees measured on the ground. When the color infrared images were used, the experiments demonstrated that the Mask R-CNN models, trained with the tree labels generated through lidar-based clustering, yielded mean F1 scores of 79% that were up to 18% higher than those of the normalized cut baseline method and thus significantly improved. Similarly, the mean overall accuracy of the classification results for the coniferous, deciduous, and standing deadwood tree groups was 96% and enhanced by up to 6% compared with the baseline classification approach. The experiments with lidar-based images yielded slightly worse (1–2%) results both for segmentation and for classification. Our study demonstrates the utility of this simplified training data preparation procedure, which leads to models trained with significantly larger amounts of data than is feasible with manual labelling. The accuracy improvement of up to 18% in terms of the F1 score is further evidence of its advantages. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
20. Gateinst: instance segmentation with multi-scale gated-enhanced queries in transformer decoder.
- Author
-
Lin, Chih-Wei, Lin, Ye, Zhou, Shangtai, and Zhu, Lirong
- Abstract
Recently, a popular query-based end-to-end framework has been used for instance segmentation. However, queries update based on individual layers or scales of feature maps at each stage of Transformer decoding, which makes queries unable to gather sufficient multi-scale feature information. Therefore, querying these features may result in inconsistent information due to disparities among feature maps, leading to erroneous updates. This study proposes a new network called GateInst, which employs a dual-path auto-select mechanism based on gate structures to overcome these issues. Firstly, we design a block-wise multi-scale feature fusion module that combines features of different scales while maintaining low computational cost. Secondly, we introduce the gated-enhanced queries Transformer decoder that utilizes a gating mechanism to filter and merge the queries generated at different stages to compensate for the inaccuracies in updating queries. GateInst addresses the issue of insufficient feature information and compensates for the problem of cumulative errors in queries. Experiments have shown that GateInst achieves significant gains of 8.4 AP and 5.5 AP50 over Mask2Former on the self-collected Tree Species Instance Dataset and performs well compared to non-Mask2Former-like and Mask2Former-like networks on self-collected and public COCO datasets, with only a tiny amount of additional computational cost and fast convergence. Code and models are available at . [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
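The gating mechanism described — letting a learned sigmoid gate decide, per dimension, how much of each stage's query to keep — reduces to g·a + (1 − g)·b. A toy sketch (illustrative only, not GateInst's code; in the real network the gate logits come from a learned projection):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_merge(query_a, query_b, gate_logits):
    """Per-dimension gated fusion of two query vectors.

    A gate near 1 keeps query_a; a gate near 0 keeps query_b.
    """
    return [
        sigmoid(l) * a + (1.0 - sigmoid(l)) * b
        for a, b, l in zip(query_a, query_b, gate_logits)
    ]

# saturated logits make the selection easy to read off:
# the first dimension follows query_a, the second follows query_b
merged = gated_merge([1.0, 1.0], [0.0, 0.0], [100.0, -100.0])
```

Because the gate is differentiable, the network can learn which decoder stage to trust per query dimension instead of committing to a hard selection.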
21. Investigation of the prediction of wildlife animals and its deployment using the robot.
- Author
-
Kaur, Parminder, Kansal, Sachin, and Singh, Varinder P.
- Abstract
Monitoring wildlife in their natural habitat requires direct human intervention, and some animals are scared of humans. In such situations, camera-equipped devices are deployed to gain a clear picture of the wildlife. Objective: Current wildlife detection models detect and classify animals from camera-captured images, limiting the action taken to rescue or save them from mishaps. Also, camera-equipped devices are fixed at particular locations. Therefore, an efficient detection model capable of protecting the animal has the potential to play an important role. Method: To this end, we present Pred-WAR, a Convolution Neural Network (CNN)-based image classification approach to detect and raise rescue alerts for wildlife in real time. In our approach, we propose a Mask Region-based CNN (Mask RCNN or MRCNN) with an Automatic Mixed Precision model, implemented on a Robot Operating System-based mobile robot with a Raspberry Pi 4, to detect animals and raise an acoustic lion call or alarm to alert or rescue animals in real time. Results: Pred-WAR obtained a mean Average Precision value of 85.47% and an F1 score of 87.73%, with precision values ranging between 92% and 99%, outperforming the current MRCNN model. Significance: This approach has fast computation speed and maintains accuracy, so it can be efficiently implemented in real-time scenarios. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
22. MRI-Based Brain Tumor Instance Segmentation Using Mask R-CNN.
- Author
-
Nasrudin, Muhammad
- Subjects
-
BRAIN tumors, MAGNETIC resonance imaging, IMAGE analysis, DEEP learning, DIAGNOSTIC imaging
- Abstract
Brain tumor segmentation is a crucial step in medical image analysis for the accurate diagnosis and treatment of patients. Traditional methods for tumor segmentation often require extensive manual effort and are prone to variability. In this study, we propose an automated approach for brain tumor segmentation using Mask R-CNN, a state-of-the-art deep learning model for instance segmentation. Our method leverages MRI images to identify and delineate brain tumors with high precision. We trained the Mask R-CNN model on a dataset of annotated MRI images and evaluated its performance using the mean Average Precision (mAP) metric. The results demonstrate that our model achieves a high mAP of 90.3%, indicating its effectiveness in accurately segmenting brain tumors. This automated approach not only reduces the manual effort required for tumor segmentation but also provides consistent and reliable results, potentially improving clinical outcomes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
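The mAP metric used to evaluate the Mask R-CNN segmentations above is built from mask IoU and a precision computed at an IoU threshold. A toy sketch of those building blocks (not the authors' evaluation code):

```python
import numpy as np

# Mask IoU and single-threshold precision on toy binary masks.

def mask_iou(a, b):
    a, b = a.astype(bool), b.astype(bool)
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 0.0

def precision_at_iou(preds, gts, thr=0.5):
    # Greedy one-to-one matching of predicted masks to ground truth.
    matched, tp = set(), 0
    for p in preds:
        for i, g in enumerate(gts):
            if i not in matched and mask_iou(p, g) >= thr:
                matched.add(i)
                tp += 1
                break
    return tp / len(preds) if preds else 0.0

gt = np.zeros((8, 8), int); gt[2:6, 2:6] = 1       # 4x4 "tumor" region
pred = np.zeros((8, 8), int); pred[3:6, 2:6] = 1   # 3x4 overlapping prediction
print(mask_iou(pred, gt))              # 12/16 = 0.75
print(precision_at_iou([pred], [gt]))  # 1.0 at the default 0.5 threshold
```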
23. Instance Segmentation of Stacked Parts Based on Improved YOLOv8.
- Author
-
王众玄, 邹光明, 顾浩文, 许艳涛, and 李陈佳瑞
- Subjects
INDUSTRIAL robots ,FEATURE extraction ,SPINE - Abstract
Copyright of Machine Tool & Hydraulics is the property of Guangzhou Mechanical Engineering Research Institute (GMERI) and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
24. TSPconv-Net: Transformer and Sparse Convolution for 3D Instance Segmentation in Point Clouds.
- Author
-
Ning, Xiaojuan, Liu, Yule, Ma, Yishu, Lu, Zhiwei, Jin, Haiyan, Shi, Zhenghao, and Wang, Yinghui
- Subjects
- *
MULTILAYER perceptrons , *FEATURE extraction , *DEEP learning , *TRANSFORMER models , *POINT cloud - Abstract
Current deep learning approaches for indoor 3D instance segmentation often rely on multilayer perceptrons (MLPs) for feature extraction. However, MLPs struggle to effectively capture the complex spatial relationships inherent in 3D scene data. To address this issue, we propose a novel and efficient framework for 3D instance segmentation called TSPconv-Net. In contrast to existing methods that primarily depend on MLPs for feature extraction, our framework integrates a more robust feature extraction model comprising the offset-attention (OA) mechanism and submanifold sparse convolution (SSC). The proposed framework is an end-to-end network architecture. TSPconv-Net consists of a backbone network followed by a bounding box module. Specifically, the backbone network utilizes the OA mechanism to extract global features and employs SSC for local feature extraction. The bounding box module then conducts instance segmentation based on the extracted features. Experimental results demonstrate that our approach outperforms existing work on the S3DIS dataset while maintaining computational efficiency. TSPconv-Net achieves 68.6% mPrec, 52.5% mRec, and 60.1% mAP on the test set, surpassing 3D-BoNet by 3.0% mPrec, 5.4% mRec, and 2.6% mAP. Furthermore, it demonstrates high efficiency, completing computations in just 326 s. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
25. A New Instance Segmentation Model for High-Resolution Remote Sensing Images Based on Edge Processing.
- Author
-
Zhang, Xiaoying, Shen, Jie, Hu, Huaijin, and Yang, Houqun
- Subjects
- *
REMOTE sensing , *FEATURE extraction , *IMAGE segmentation - Abstract
With the goal of addressing the challenges of small, densely packed targets in remote sensing images, we propose a high-resolution instance segmentation model named QuadTransPointRend Net (QTPR-Net). This model significantly enhances instance segmentation performance in remote sensing images. The model consists of two main modules: preliminary edge feature extraction (PEFE) and edge point feature refinement (EPFR). We also created a specific approach and strategy named TransQTA for edge uncertainty point selection and feature processing in high-resolution remote sensing images. Multi-scale feature fusion and transformer technologies are used in QTPR-Net to refine rough masks and fine-grained features for selected edge uncertainty points while balancing model size and accuracy. Based on experiments performed on three public datasets: NWPU VHR-10, SSDD, and iSAID, we demonstrate the superiority of QTPR-Net over existing approaches. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
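The edge-point refinement described above starts by selecting uncertain points near the predicted mask boundary, in the PointRend style. A sketch of one common selection rule (highest uncertainty where the foreground probability is closest to 0.5); TransQTA's actual strategy is not reproduced here:

```python
import numpy as np

# Select the k most uncertain points from a coarse mask probability map.

def uncertain_points(prob, k):
    # Uncertainty peaks where the probability is nearest 0.5,
    # i.e. along the predicted instance edge.
    uncertainty = -np.abs(prob - 0.5)
    flat = np.argsort(uncertainty.ravel())[::-1][:k]
    return np.stack(np.unravel_index(flat, prob.shape), axis=1)

prob = np.array([[0.9, 0.8, 0.1],
                 [0.7, 0.5, 0.2],
                 [0.6, 0.4, 0.1]])
print(uncertain_points(prob, 2))  # the pixel (1,1) with p=0.5 comes first
```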
26. The Development of a Yolov8-Based Model for the Measurement of Critical Shoulder Angle (CSA), Lateral Acromion Angle (LAA), and Acromion Index (AI) from Shoulder X-ray Images.
- Author
-
Selçuk, Turab
- Subjects
- *
SHOULDER joint , *STATISTICAL measurement , *SHOULDER disorders , *ACROMION , *X-ray imaging - Abstract
Background: The accurate and effective evaluation of parameters such as critical shoulder angle, lateral acromion angle, and acromion index from shoulder X-ray images is crucial for identifying pathological changes and assessing disease risk in the shoulder joint. Methods: In this study, a YOLOv8-based model was developed to automatically measure these three parameters together, contributing to the existing literature. Initially, YOLOv8 was used to segment the acromion, glenoid, and humerus regions, after which the CSA, LAA angles, and AI between these regions were calculated. The MURA dataset was employed in this study. Results: Segmentation performance was evaluated with the Dice and Jaccard similarity indices, both exceeding 0.9. Statistical analyses of the measurement performance, including Pearson correlation coefficient, RMSE, and ICC values demonstrated that the proposed model exhibits high consistency and similarity with manual measurements. Conclusions: The results indicate that automatic measurement methods align with manual measurements with high accuracy and offer an effective alternative for clinical applications. This study provides valuable insights for the early diagnosis and management of shoulder diseases and makes a significant contribution to existing measurement methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
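Once the acromion and glenoid regions are segmented, angles such as the CSA reduce to vector geometry between landmark points. A sketch of that computation; the landmark coordinates below are hypothetical, purely to illustrate the geometry, not taken from the study:

```python
import numpy as np

# Angle at a vertex between two landmark points, in degrees.

def angle_deg(vertex, p1, p2):
    v1 = np.asarray(p1, float) - np.asarray(vertex, float)
    v2 = np.asarray(p2, float) - np.asarray(vertex, float)
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

# Hypothetical landmark coordinates for illustration only.
inferior_glenoid = (0.0, 0.0)
superior_glenoid = (0.0, 10.0)
lateral_acromion = (6.0, 10.0)
print(round(angle_deg(inferior_glenoid, superior_glenoid, lateral_acromion), 1))  # 31.0
```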
27. An Improved Instance Segmentation Method for Complex Elements of Farm UAV Aerial Survey Images.
- Author
-
Lv, Feixiang, Zhang, Taihong, Zhao, Yunjie, Yao, Zhixin, and Cao, Xinyu
- Subjects
- *
FARM mechanization , *AERIAL surveys , *IMAGE processing , *FARMS , *PYRAMIDS - Abstract
Farm aerial survey layers can assist in unmanned farm operations, such as planning paths and early warnings. To address the inefficiencies and high costs associated with traditional layer construction, this study proposes a high-precision instance segmentation algorithm based on SparseInst. Considering the structural characteristics of farm elements, this study introduces a multi-scale attention module (MSA) that leverages the properties of atrous convolution to expand the sensory field. It enhances spatial and channel feature weights, effectively improving segmentation accuracy for large-scale and complex targets in the farm through three parallel dense connections. A bottom-up aggregation path is added to the feature pyramid fusion network, enhancing the model's ability to perceive complex targets such as mechanized trails in farms. Coordinate attention blocks (CAs) are incorporated into the neck to capture richer contextual semantic information, enhancing farm aerial imagery scene recognition accuracy. To assess the proposed method, we compare it against existing mainstream object segmentation models, including the Mask R-CNN, Cascade–Mask, SOLOv2, and Condinst algorithms. The experimental results show that the improved model proposed in this study can be adapted to segment various complex targets in farms. The accuracy of the improved SparseInst model greatly exceeds that of Mask R-CNN and Cascade–Mask and is 10.8 and 12.8 percentage points better than the average accuracy of SOLOv2 and Condinst, respectively, with the smallest number of model parameters. The results show that the model can be used for real-time segmentation of targets under complex farm conditions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
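The MSA module above relies on the defining property of atrous (dilated) convolution: the dilation rate widens the receptive field without adding weights. A 1-D numpy sketch of that property, not the paper's implementation:

```python
import numpy as np

# 1-D dilated convolution: the same 3 weights cover a wider span
# as the dilation rate grows.

def dilated_conv1d(x, w, dilation):
    k = len(w)
    span = (k - 1) * dilation + 1          # effective receptive field
    out = np.empty(len(x) - span + 1)
    for i in range(len(out)):
        out[i] = sum(w[j] * x[i + j * dilation] for j in range(k))
    return out

x = np.arange(10, dtype=float)
w = np.array([1.0, 1.0, 1.0])
print(dilated_conv1d(x, w, 1))  # taps 3 adjacent samples
print(dilated_conv1d(x, w, 2))  # the same 3 weights now span 5 samples
```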
28. DIO-SLAM: A Dynamic RGB-D SLAM Method Combining Instance Segmentation and Optical Flow.
- Author
-
He, Lang, Li, Shiyun, Qiu, Junting, and Zhang, Chenhaomin
- Subjects
- *
OPTICAL flow , *POINT cloud , *THREAD (Textiles) , *ALGORITHMS - Abstract
Feature points from moving objects can negatively impact the accuracy of Visual Simultaneous Localization and Mapping (VSLAM) algorithms, while detection or semantic segmentation-based VSLAM approaches often fail to accurately determine the true motion state of objects. To address this challenge, this paper introduces DIO-SLAM: Dynamic Instance Optical Flow SLAM, a VSLAM system specifically designed for dynamic environments. Initially, the detection thread employs YOLACT (You Only Look At CoefficienTs) to distinguish between rigid and non-rigid objects within the scene. Subsequently, the optical flow thread estimates optical flow and introduces a novel approach to capture the optical flow of moving objects by leveraging optical flow residuals. Following this, an optical flow consistency method is implemented to assess the dynamic nature of rigid object mask regions, classifying them as either moving or stationary rigid objects. To mitigate errors caused by missed detections or motion blur, a motion frame propagation method is employed. Lastly, a dense mapping thread is incorporated to filter out non-rigid objects using semantic information, track the point clouds of rigid objects, reconstruct the static background, and store the resulting map in an octree format. Experimental results demonstrate that the proposed method surpasses current mainstream dynamic VSLAM techniques in both localization accuracy and real-time performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
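The optical-flow-residual idea behind DIO-SLAM's flow thread can be sketched as: subtract the flow explained by camera ego-motion and threshold the remaining magnitude to flag moving pixels. A synthetic-flow sketch of that idea, assuming the ego-flow field is already estimated (not the paper's code):

```python
import numpy as np

# Flag pixels whose flow residual (observed minus ego-motion flow)
# exceeds a magnitude threshold.

def dynamic_mask(flow, ego_flow, thr=1.0):
    residual = np.linalg.norm(flow - ego_flow, axis=-1)
    return residual > thr

# Camera pans right: background flow ~ (2, 0) everywhere.
ego = np.full((4, 4, 2), (2.0, 0.0))
flow = ego.copy()
flow[1:3, 1:3] += (4.0, 0.0)   # a moving object adds its own motion

mask = dynamic_mask(flow, ego, thr=1.0)
print(mask.sum())  # 4 pixels flagged as dynamic
```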
29. High-Precision Automated Soybean Phenotypic Feature Extraction Based on Deep Learning and Computer Vision.
- Author
-
Zhang, Qi-Yuan, Fan, Ke-Jun, Tian, Zhixi, Guo, Kai, and Su, Wen-Hao
- Subjects
TIME complexity ,COMPUTER vision ,FEATURE extraction ,DEEP learning ,PLANT anatomy - Abstract
The automated collection of plant phenotypic information has become a trend in breeding and smart agriculture. Four YOLOv8-based models were used to segment mature soybean plants placed against a simple background in a laboratory environment, identify pods, distinguish the number of soybeans in each pod, and obtain soybean phenotypes. The YOLOv8-Repvit model yielded the best recognition results, with an R² value of 0.96 for both pods and beans, and RMSE values of 2.89 and 6.90, respectively. Moreover, a novel algorithm, the midpoint coordinate algorithm (MCA), was devised to efficiently differentiate between the main stem and branches of soybean plants. This is accomplished by linking the white pixels representing the stems in each column of the binary image to draw curves that represent the plant structure. The proposed method reduces computational time and spatial complexity compared with the A* algorithm, providing an efficient and accurate approach for measuring the phenotypic characteristics of soybean plants. This research lays a technical foundation for obtaining phenotypic data of densely overlapped and partitioned mature soybean plants under field conditions at harvest. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
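The midpoint coordinate algorithm described above (the midpoint of the white stem pixels in each image column) can be sketched in a few lines on a toy binary mask:

```python
import numpy as np

# Per-column midpoints of the white (stem) pixels in a binary mask,
# tracing the plant structure column by column.

def column_midpoints(mask):
    pts = []
    for col in range(mask.shape[1]):
        rows = np.flatnonzero(mask[:, col])
        if rows.size:
            pts.append((int((rows[0] + rows[-1]) // 2), col))
    return pts

mask = np.zeros((5, 4), dtype=np.uint8)
mask[1:4, 1] = 1   # stem pixels in column 1 -> midpoint row 2
mask[0:5, 2] = 1   # taller run in column 2 -> midpoint row 2
print(column_midpoints(mask))  # [(2, 1), (2, 2)]
```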
30. An instance segmentation model based on improved SOLOv2 and Chan–Vese.
- Author
-
Zou, Le, Wang, Chengcheng, Wu, Zhize, Sun, Lingma, and Wang, Xiaofeng
- Abstract
Classical instance segmentation models suffer from incomplete contextual feature information and rough smoothing and refinement of segmentation edges, which reduce segmentation accuracy. To solve these problems, we propose a box-supervised instance segmentation model based on an improved SOLOv2 and the Chan–Vese level set method. Firstly, dilated convolution is introduced into the dynamic convolution kernel prediction module of the SOLOv2 model; the improved SOLOv2 mask-supervised model is used to predict the instance mask, which enlarges the receptive field and obtains rich contextual feature information. Secondly, a box projection function is introduced to map the instance mask to the initial contour of the Chan–Vese model, thus achieving box-supervised instance segmentation. Finally, an improved length regularization term is added to the Chan–Vese functional to make object contour edges smoother and segment object contours effectively. The experimental results show that the proposed model obtains mask mAP of 39.4%, 32.6%, and 22.4% on the Pascal VOC, COCO, and Cityscapes datasets, respectively, verifying that the proposed method performs better for image edge segmentation in general scenes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
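The data term of the Chan–Vese model alternates between estimating the mean intensities inside and outside the contour and reassigning each pixel to the closer mean. A minimal sketch of that data term only, deliberately omitting the (improved) length regularization term the paper adds:

```python
import numpy as np

# Chan-Vese data term: iterate region means c1/c2 and pixel-wise
# energy comparison (img - c1)^2 vs (img - c2)^2. No curvature term.

def chan_vese_data_term(img, init, iters=10):
    inside = init.astype(bool)
    for _ in range(iters):
        c1 = img[inside].mean() if inside.any() else 0.0
        c2 = img[~inside].mean() if (~inside).any() else 0.0
        inside = (img - c1) ** 2 < (img - c2) ** 2
    return inside

img = np.zeros((6, 6)); img[2:5, 2:5] = 1.0    # bright 3x3 object
init = np.zeros((6, 6)); init[2:4, 2:4] = 1    # seed overlapping the object
print(chan_vese_data_term(img, init).sum())    # converges to the 9 object pixels
```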
31. DAMM for the detection and tracking of multiple animals within complex social and environmental settings.
- Author
-
Kaul, Gaurav, McDevitt, Jonathan, Johnson, Justin, and Eban-Rothschild, Ada
- Subjects
- *
ANIMAL tracks , *ARTIFICIAL satellite tracking , *COMPUTER vision , *RATS , *ANIMAL behavior - Abstract
Accurate detection and tracking of animals across diverse environments are crucial for studying brain and behavior. Recently, computer vision techniques have become essential for high-throughput behavioral studies; however, localizing animals in complex conditions remains challenging due to intra-class visual variability and environmental diversity. These challenges hinder studies in naturalistic settings, such as when animals are partially concealed within nests. Moreover, current tools are laborious and time-consuming, requiring extensive, setup-specific annotation and training procedures. To address these challenges, we introduce the 'Detect-Any-Mouse-Model' (DAMM), an object detector for localizing mice in complex environments with minimal training. Our approach involved collecting and annotating a diverse dataset of single- and multi-housed mice in complex setups. We trained a Mask R-CNN, a popular object detector in animal studies, to perform instance segmentation and validated DAMM's performance on a collection of downstream datasets using zero-shot and few-shot inference. DAMM excels in zero-shot inference, detecting mice and even rats, in entirely unseen scenarios and further improves with minimal training. Using the SORT algorithm, we demonstrate robust tracking, competitive with keypoint-estimation-based methods. Notably, to advance and simplify behavioral studies, we release our code, model weights, and data, along with a user-friendly Python API and a Google Colab implementation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
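The SORT tracking used above is built around frame-to-frame data association of detection boxes by IoU. A greedy sketch of that association core (SORT proper adds Kalman-filter prediction and Hungarian matching):

```python
# Match current-frame detections to existing tracks by box IoU.
# Boxes are (x1, y1, x2, y2).

def box_iou(a, b):
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def associate(tracks, dets, thr=0.3):
    pairs, used = [], set()
    for ti, t in enumerate(tracks):
        best, best_iou = None, thr
        for di, d in enumerate(dets):
            iou = box_iou(t, d)
            if di not in used and iou >= best_iou:
                best, best_iou = di, iou
        if best is not None:
            used.add(best)
            pairs.append((ti, best))
    return pairs

tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]
dets = [(21, 19, 31, 29), (1, 1, 11, 11)]   # slightly shifted boxes
print(associate(tracks, dets))  # [(0, 1), (1, 0)]
```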
32. TP-Transfiner: high-quality segmentation network for tea pest.
- Author
-
Ruizhao Wu, Feng He, Ziyang Rong, Zhixue Liang, Wenxing Xu, Fuchuan Ni, and Wenyong Dong
- Subjects
PEST control ,FEATURE extraction ,PESTS ,TEA ,GARDENS - Abstract
Detecting and controlling tea pests promptly is crucial for safeguarding tea production quality. Owing to the insufficient feature extraction ability of traditional CNN-based methods, detecting pests in dense and mimicry scenarios is inaccurate and inefficient. This study proposes an end-to-end tea pest detection and segmentation framework, TeaPest-Transfiner (TP-Transfiner), based on Mask Transfiner, to address the challenge of detecting and segmenting pests in mimicry and dense scenarios. To overcome the weak feature extraction ability and limited accuracy of traditional convolution modules, this study adopts three strategies. Firstly, a deformable attention block is integrated into the model, consisting of deformable convolution and self-attention using only the key-content term. Secondly, the FPN architecture in the backbone network is improved with a more effective feature-aligned pyramid network (FaPN). Lastly, focal loss is employed to balance positive and negative samples during training, with parameters adapted to the dataset distribution. Furthermore, to address the lack of tea pest images, a dataset called TeaPestDataset is constructed, containing 1,752 images of 29 species of tea pests. Experimental results on TeaPestDataset show that the proposed TP-Transfiner model achieves state-of-the-art performance compared with other models, attaining a detection precision (AP50) of 87.211% and segmentation performance of 87.381%. Notably, the model improves segmentation average precision (mAP) by 9.4% and reduces model size by 30% compared to the state-of-the-art CNN-based model Mask R-CNN. At the same time, TP-Transfiner's lightweight module fusion maintains fast inference speeds and a compact model size, demonstrating practical potential for pest control in tea gardens, especially in dense and mimicry scenarios. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
33. Instance segmentation by blend U‐Net and VOLO network.
- Author
-
Deng, Hongfei, Wen, Bin, Wang, Rui, and Feng, Zuwei
- Abstract
Instance segmentation still struggles to correctly distinguish individual instances among overlapping, dense, and numerous target objects. To address this, the authors simplify the instance segmentation problem to an instance classification problem and propose CotuNet, a novel end-to-end trained instance segmentation algorithm. Firstly, the algorithm combines convolutional neural networks (CNNs), Outlooker, and Transformer in a new hybrid encoder (COT) for feature extraction: low-level image features extracted by the CNN are passed through the Outlooker to obtain more refined local data representations, and global contextual information is then generated by aggregating these representations in local space with the Transformer. Finally, a combination of cascaded upsampling and skip-connection modules serves as the decoder (C-UP), blending high-resolution information at multiple scales to generate accurate masks. Validation on the CVPPP 2017 dataset and comparison with previous state-of-the-art methods show that CotuNet delivers superior competitiveness and segmentation performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
34. Algorithm for Locating Apical Meristematic Tissue of Weeds Based on YOLO Instance Segmentation.
- Author
-
Zhang, Daode, Lu, Rui, Guo, Zhe, Yang, Zhiyong, Wang, Siqi, and Hu, Xinyu
- Subjects
- *
WEED control , *WEEDS , *GENERALIZATION , *ALGORITHMS , *LASERS - Abstract
Laser technology can be used to control weeds by irradiating the apical meristematic tissue (AMT) of weeds when they are still seedlings. Two factors are necessary for the successful large-scale implementation of this technique: the ability to accurately identify the apical meristematic tissue and the effectiveness of the localization algorithm used in the process. On this basis, this study proposes a lightweight weed AMT localization algorithm based on YOLO (You Only Look Once) instance segmentation. The YOLOv8n-seg network undergoes a lightweight redesign by integrating the FasterNet lightweight network as its backbone, resulting in the F-YOLOv8n-seg model. This modification effectively reduces the number of parameters and computational demands during convolution, yielding a more efficient model. F-YOLOv8n-seg is then combined with the connected domain analysis algorithm (CDA), yielding the F-YOLOv8n-seg-CDA model. This integration enables precise localization of the AMT of weeds by calculating the center-of-mass coordinates of the connected domains. The experimental results indicate that the optimized model significantly outperforms the original: floating-point computations are reduced by 26.7% (to 8.9 GFLOPs) and model size by 38.2% (to 4.2 MB). The improved model is also lighter than YOLOv5s-seg and YOLOv10n-seg, and it exhibits exceptional segmentation accuracy, with a 97.2% accuracy rate. Experimental tests conducted on five weed species demonstrated that F-YOLOv8n-seg-CDA has strong generalization capabilities: the combined detection accuracy across these weeds was 81%, and dicotyledonous weeds were detected with up to 94% accuracy. Additionally, the algorithm achieved an average inference speed of 82.9 frames per second.
These results indicate that the algorithm is suitable for the real-time detection of apical meristematic tissues across multiple weed species. Furthermore, the experimental results demonstrated the impact of distinctive variations in weed morphology on identifying the location of the AMT of weeds. It was discovered that dicotyledonous and monocotyledonous weeds differed significantly in terms of the detection effect, with dicotyledonous weeds having significantly higher detection accuracy than monocotyledonous weeds. This discovery can offer novel insights and avenues for future investigation into the identification and location of the AMT of weeds. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
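The CDA step above (labeling the connected regions of the segmentation mask and taking each region's center of mass) can be sketched with a pure-Python BFS on a toy grid:

```python
from collections import deque

# Label 4-connected components of a binary mask and return each
# component's centre-of-mass (row, col) coordinates.

def component_centroids(mask):
    h, w = len(mask), len(mask[0])
    seen, centroids = set(), []
    for sr in range(h):
        for sc in range(w):
            if mask[sr][sc] and (sr, sc) not in seen:
                queue, pixels = deque([(sr, sc)]), []
                seen.add((sr, sc))
                while queue:
                    r, c = queue.popleft()
                    pixels.append((r, c))
                    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        nr, nc = r + dr, c + dc
                        if 0 <= nr < h and 0 <= nc < w and mask[nr][nc] \
                                and (nr, nc) not in seen:
                            seen.add((nr, nc))
                            queue.append((nr, nc))
                rs, cs = zip(*pixels)
                centroids.append((sum(rs) / len(rs), sum(cs) / len(cs)))
    return centroids

mask = [[1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 0, 1]]
print(component_centroids(mask))  # [(0.5, 0.5), (2.5, 3.0)]
```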
35. An Efficient and Low-Cost Deep Learning-Based Method for Counting and Sizing Soybean Nodules.
- Author
-
Wang, Xueying, Yu, Nianping, Sun, Yongzhe, Guo, Yixin, Pan, Jinchao, Niu, Jiarui, Liu, Li, Chen, Hongyu, Cao, Junzhuo, Cao, Haifeng, Chen, Qingshan, Xin, Dawei, and Zhu, Rongsheng
- Subjects
- *
NITROGEN fixation , *ROOT-tubercles , *IMAGE segmentation , *DEEP learning , *PLANT growth - Abstract
Soybeans are an essential source of food, protein, and oil worldwide, and the nodules on their root systems play a critical role in nitrogen fixation and plant growth. In this study, we tackled the challenge of limited high-resolution image quantities and the constraints on model learning by innovatively employing image segmentation technology for an in-depth analysis of soybean nodule phenomics. Through a meticulously designed segmentation algorithm, we broke down large-resolution images into numerous smaller ones, effectively improving the model's learning efficiency and significantly increasing the available data volume, thus laying a solid foundation for subsequent analysis. In terms of model selection and optimization, after several rounds of comparison and testing, YOLOX was identified as the optimal model, achieving an accuracy of 91.38% on the test set with an R² of up to 86%, fully demonstrating its efficiency and reliability in nodule counting tasks. Subsequently, we utilized YOLOv5 for instance segmentation, achieving a precision of 93.8% in quickly and accurately extracting key phenotypic indicators such as the area, circumference, length, and width of the nodules, and calculated the statistical properties of these indicators. This provided a wealth of quantitative data for the morphological study of soybean nodules. The research not only enhanced the efficiency and accuracy of obtaining nodule phenotypic data and reduced costs but also provided important scientific evidence for the selection and breeding of soybean materials, highlighting its potential application value in agricultural research and practical production. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
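Phenotypic indicators of the kind listed above (area, length, width, and a boundary measure per nodule) fall straight out of each instance mask. A toy numpy sketch, not the study's pipeline, using a simple boundary-pixel count in place of a true circumference:

```python
import numpy as np

# Area, bounding-box length/width, and boundary-pixel perimeter
# from a single binary instance mask.

def mask_phenotypes(mask):
    mask = mask.astype(bool)
    rows, cols = np.nonzero(mask)
    length = rows.max() - rows.min() + 1
    width = cols.max() - cols.min() + 1
    # Perimeter: foreground pixels with at least one background 4-neighbour.
    padded = np.pad(mask, 1)
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1]
                & padded[1:-1, :-2] & padded[1:-1, 2:])
    perimeter = int((mask & ~interior).sum())
    return {"area": int(mask.sum()), "length": int(length),
            "width": int(width), "perimeter": perimeter}

nodule = np.zeros((6, 6), np.uint8)
nodule[1:5, 2:5] = 1   # a 4x3 blob
print(mask_phenotypes(nodule))  # {'area': 12, 'length': 4, 'width': 3, 'perimeter': 10}
```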
36. Application of Instance Segmentation to Identifying Insect Concentrations in Data from an Entomological Radar.
- Author
-
Wang, Rui, Ren, Jiahao, Li, Weidong, Yu, Teng, Zhang, Fan, and Wang, Jiangtao
- Subjects
- *
INSECT behavior , *FEATURE extraction , *RADAR , *INSECTS , *DATA extraction - Abstract
Entomological radar is one of the most effective tools for monitoring insect migration, capable of detecting migratory insects concentrated in layers and facilitating the analysis of insect migration behavior. However, traditional entomological radar, with its low resolution, can only provide a rough observation of layer concentrations. The advent of High-Resolution Phased Array Radar (HPAR) has transformed this situation. With its high range resolution and high data update rate, HPAR can generate detailed concentration spatiotemporal distribution heatmaps. This technology facilitates the detection of changes in insect concentrations across different time periods and altitudes, thereby enabling the observation of large-scale take-off, landing, and layering phenomena. However, the lack of effective techniques for extracting insect concentration data of different phenomena from these heatmaps significantly limits detailed analyses of insect migration patterns. This paper is the first to apply instance segmentation technology to the extraction of insect data, proposing a method for segmenting and extracting insect concentration data from spatiotemporal distribution heatmaps at different phenomena. To address the characteristics of concentrations in spatiotemporal distributions, we developed the Heatmap Feature Fusion Network (HFF-Net). In HFF-Net, we incorporate the Global Context (GC) module to enhance feature extraction of concentration distributions, utilize the Atrous Spatial Pyramid Pooling with Depthwise Separable Convolution (SASPP) module to extend the receptive field for understanding various spatiotemporal distributions of concentrations, and refine segmentation masks with the Deformable Convolution Mask Fusion (DCMF) module to enhance segmentation detail. 
Experimental results show that our proposed network can effectively segment concentrations of different phenomena from heatmaps, providing technical support for detailed and systematic studies of insect migration behavior. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
37. Lesion Localization and Pathological Diagnosis of Ovine Pulmonary Adenocarcinoma Based on MASK R-CNN.
- Author
-
Chen, Sixu, Zhang, Pei, Duan, Xujie, Bao, Anyu, Wang, Buyu, Zhang, Yufei, Li, Huiping, Zhang, Liang, and Liu, Shuying
- Subjects
- *
CONVOLUTIONAL neural networks , *ARTIFICIAL neural networks , *DEEP learning , *ARTIFICIAL intelligence , *ANIMAL industry , *LUNGS - Abstract
Simple Summary: In this study, a Common Objects in Context dataset of ovine pulmonary adenocarcinoma pathological images was constructed based on 7167 annotated typical lesions from 61,063 lung pathological images of ovine pulmonary adenocarcinoma. This study aimed to develop a mask regional convolutional neural network model for the localization and pathological diagnosis of ovine pulmonary adenocarcinoma lesions. The model achieved a mean average specificity of 0.573 and an average sensitivity of 0.745, with consistency rates of 100% for junior pathologists and 96.5% for senior pathologists in the diagnosis of ovine pulmonary adenocarcinoma. The successful development of this model not only facilitates the rapid diagnosis of ovine pulmonary adenocarcinoma by different personnel in practical applications but also lays a foundation for the transition from traditional pathology to digital pathology in the livestock industry. Ovine pulmonary adenocarcinoma (OPA) is a contagious lung tumour caused by the Jaagsiekte Sheep Retrovirus (JSRV). Histopathological diagnosis is the gold standard for OPA diagnosis. However, interpretation of traditional pathology images is complex and operator dependent. The mask regional convolutional neural network (Mask R-CNN) has emerged as a valuable tool in pathological diagnosis. This study utilized 54 typical OPA whole slide images (WSI) to extract 7167 typical lesion images containing OPA to construct a Common Objects in Context (COCO) dataset for OPA pathological images. The dataset was categorized into training and test sets (8:2 ratio) for model training and validation. Mean average specificity (mASp) and average sensitivity (ASe) were used to evaluate model performance. Six WSI-level pathological images (three OPA and three non-OPA images), not included in the dataset, were used for anti-peeking model validation. 
A random selection of 500 images, not included in the dataset establishment, was used to compare the performance of the model with assessment by pathologists. Accuracy, sensitivity, specificity, and concordance rate were evaluated. The model achieved a mASp of 0.573 and an ASe of 0.745, demonstrating effective lesion detection and alignment with expert annotation. In Anti-Peeking verification, the model showed good performance in locating OPA lesions and distinguished OPA from non-OPA pathological images. In the random 500-image diagnosis, the model achieved 92.8% accuracy, 100% sensitivity, and 88% specificity. The agreement rates between junior and senior pathologists were 100% and 96.5%, respectively. In conclusion, the Mask R-CNN-based OPA diagnostic model developed for OPA facilitates rapid and accurate diagnosis in practical applications. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
38. GHA-Inst: a real-time instance segmentation model utilizing YOLO detection framework.
- Author
-
Dong, Chengang, Tang, Yuhao, and Zhang, Liyan
- Subjects
- *
DEEP learning , *NECK , *NOISE , *VIDEOS - Abstract
The real-time instance segmentation task based on deep learning aims to accurately identify and distinguish all instance objects from images or videos. However, due to the existence of problems such as mutual occlusion between instances, limitations in model receptive fields, etc., achieving accurate and real-time segmentation continues to pose a formidable challenge. To alleviate the aforementioned issues, this paper proposes a real-time instance segmentation method based on a dual-branch structure, called GHA-Inst. Specifically, we made improvements to the feature fusion module (Neck) and output end (Head) of the YOLOv7-seg real-time instance segmentation framework to mitigate the accuracy reduction caused by feature loss and reduce the interference of background noise on the model. Secondly, we introduced a Global Hybrid-Domain Attention (GHA) module to improve the model's focus on significant information while retaining more original spatial features, alleviate incomplete segmentation caused by instance occlusion, and improve the quality of generated masks. Finally, our method achieved competitive results on multiple metrics of the MS COCO 2017 and KINS open-source datasets. Compared with the YOLOv7-seg baseline model, GHA-Inst improved the average precision (AP) by 3.4% and 2.6% on the two datasets, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
39. Real-Time Instance Segmentation Method Based on Location Attention.
- Author
-
Li Liu and Yuqi Kong
- Subjects
OBJECT recognition (Computer vision) ,COMPUTER vision ,VISUAL fields ,DEEP learning ,PYRAMIDS - Abstract
Instance segmentation is a challenging research topic in the field of computer vision; it combines the prediction results of object detection and semantic segmentation to provide richer image feature information. Focusing on instance segmentation in street scenes, this paper proposes a real-time instance segmentation method based on SOLOv2. First, a cross-stage fusion backbone network based on position attention is designed to increase model accuracy and reduce computational effort. Then, the loss of shallow location information is reduced by integrating a two-way feature pyramid network. Meanwhile, cross-stage mask feature fusion is designed to address the missed segmentation of small objects. Finally, an adaptive minimum loss matching method is proposed to reduce the loss of segmentation accuracy caused by object occlusion in the image. Compared with other mainstream methods, our method meets real-time segmentation requirements and achieves competitive segmentation accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
40. Open-set marine object instance segmentation with prototype learning.
- Author
-
Hu, Xing, Li, Panlong, Karimi, Hamid Reza, Jiang, Linhua, and Zhang, Dawei
- Abstract
The ocean is full of Unknown Marine Objects (UMOs), which makes it difficult to handle unknown ocean targets with traditional instance segmentation models: trained on closed datasets, they assume all detected objects are Known Marine Objects (KMOs) and consequently often misclassify UMOs as KMOs. To address this problem, this paper proposes a new open-set instance segmentation model for object instance segmentation in marine environments with UMOs. Specifically, we integrate two learning modules into the model: a prototype module and an unknown learning module. Through a learnable prototype, the prototype module improves intra-class compactness and boundary detection capability while also increasing classification accuracy. From the uncertainty of low-probability samples, the unknown learning module predicts the probability that a target is unknown. Experimental results illustrate that the proposed method achieves known-class recognition accuracy competitive with existing instance segmentation models while accurately distinguishing unknown targets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
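The prototype module in the record above assigns a sample to its nearest learnable class prototype, while the unknown learning module rejects low-confidence samples as unknown. A minimal sketch of that decision rule, assuming Euclidean distance to prototypes and a softmax-score threshold (the function name, metric, and threshold value are illustrative, not the paper's exact formulation):

```python
import math

def classify_open_set(feature, prototypes, tau=0.5):
    """Assign a feature vector to the nearest class prototype, or flag it
    as 'unknown' when the best match is not confident enough.

    feature    : embedding vector (list of floats) from the backbone
    prototypes : dict mapping class name -> prototype vector
    tau        : rejection threshold on the softmax score (assumed value)
    """
    # Negative Euclidean distance to each prototype acts as a similarity logit.
    logits = {c: -math.dist(feature, p) for c, p in prototypes.items()}
    # Numerically stable softmax over the logits.
    m = max(logits.values())
    exps = {c: math.exp(v - m) for c, v in logits.items()}
    z = sum(exps.values())
    probs = {c: e / z for c, e in exps.items()}
    best = max(probs, key=probs.get)
    # Low-probability samples are treated as unknown marine objects.
    return best if probs[best] >= tau else "unknown"
```

A sample near one prototype is labeled with that class; a sample equidistant from all prototypes yields a flat score distribution and is rejected.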
41. A rail surface defect detection method based on semantic augmentation and YOLOv8.
- Author
-
吴永军, 崔灿, and 何永福
- Abstract
Copyright of Journal of Railway Science & Engineering is the property of Journal of Railway Science & Engineering Editorial Office and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
42. CIS: A Coral Instance Segmentation Network Model with Novel Upsampling, Downsampling, and Fusion Attention Mechanism.
- Author
-
Li, Tianrun, Liang, Zhengyou, and Zhao, Shuqi
- Subjects
CORAL reefs & islands, CORALS, DEEP learning, MORPHOLOGY
- Abstract
Coral segmentation poses unique challenges due to its irregular morphology and camouflage-like characteristics. These factors often result in low precision, large model parameters, and poor real-time performance. To address these issues, this paper proposes a novel coral instance segmentation (CIS) network model. Initially, we designed a novel downsampling module, ADown_HWD, which operates at multiple resolution levels to extract image features, thereby preserving crucial information about coral edges and textures. Subsequently, we integrated the bi-level routing attention (BRA) mechanism into the C2f module to form the C2f_BRA module within the neck network. This module effectively removes redundant information, enhancing the ability to distinguish coral features and reducing computational redundancy. Finally, dynamic upsampling, Dysample, was introduced into the CIS to better retain the rich semantic and key feature information of corals. Validation on our self-built dataset demonstrated that the CIS network model significantly outperforms the baseline YOLOv8n model, with improvements of 6.3% and 10.5% in P_B and P_M and 2.3% and 2.4% in mAP50_B and mAP50_M, respectively. Furthermore, the reduction in model parameters by 10.1% correlates with a notable 10.7% increase in frames per second (FPS) to 178.6, thus effectively meeting real-time operational requirements. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
43. Improvement of rotated object detection and instance segmentation in warship satellite remote sensing images based on convolutional neural network.
- Author
-
Kaifa, Ding, Yang, Yang, Jianwu, Mu, and Kaixuan, Hu
- Subjects
CONVOLUTIONAL neural networks, OBJECT recognition (Computer vision), REMOTE sensing, PROBLEM solving, PIXELS
- Abstract
Object detection and instance segmentation networks are improved to realise the accurate detection and instance segmentation of rotated warship objects in satellite remote sensing images. An adaptive threshold generation scheme and segmentation annotation information are used to improve a rotated label generation method and obtain high-precision rotated object labels. The original RPN is combined with a bbox head with improved output dimensions to obtain a rotated RPN that generates rotated region proposals. Rotated RoIAlign is used to solve the mismatch between rotated region proposals and the dimensions of subsequent feature maps. A rotated detection frame is used to correct the output of the network, which alleviates false detection and omission; in addition, it removes pixels outside the rotated detection frame that are incorrectly classified as objects. The improved networks achieve high-precision detection and instance segmentation of rotated warship objects, and the methods used in this study generalise well. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
44. Instance Segmentation of Characters Recognized in Palmyrene Aramaic Inscriptions.
- Author
-
Hamplová, Adéla, Lyavdansky, Alexey, Novák, Tomáš, Svojše, Ondřej, Franc, David, and Veselý, Arnošt
- Subjects
OPTICAL character recognition, MACHINE learning, COMPUTER vision, PROGRAMMING languages, HISTORIC preservation, DEEP learning
- Abstract
This study presents a single-class and multi-class instance segmentation approach applied to ancient Palmyrene inscriptions, employing two state-of-the-art deep learning algorithms, namely YOLOv8 and Roboflow 3.0. The goal is to contribute to the preservation and understanding of historical texts, showcasing the potential of modern deep learning methods in archaeological research. Our research culminates in several key findings and scientific contributions. We comprehensively compare the performance of YOLOv8 and Roboflow 3.0 in the context of Palmyrene character segmentation; this comparative analysis focuses mainly on the strengths and weaknesses of each algorithm in this context. We also created and annotated an extensive dataset of Palmyrene inscriptions, a crucial resource for further research in the field, which is used for training and evaluating the segmentation models. We employ comparative evaluation metrics to quantitatively assess the segmentation results, ensuring the reliability and reproducibility of our findings, and we present custom visualization tools for predicted segmentation masks. Our study advances the state of the art in semi-automatic reading of Palmyrene inscriptions and establishes a benchmark for future research. The availability of the Palmyrene dataset and the insights into algorithm performance contribute to the broader understanding of historical text analysis. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
45. A Method for Sorting High-Quality Fresh Sichuan Pepper Based on a Multi-Domain Multi-Scale Feature Fusion Algorithm.
- Author
-
Xiang, Pengjun, Pan, Fei, Duan, Xuliang, Yang, Daizhuang, Hu, Mengdie, He, Dawei, Zhao, Xiaoyu, and Huang, Fang
- Subjects
COMPUTER vision, AGRICULTURAL processing, MANUFACTURING processes, PEPPERS, ALGORITHMS
- Abstract
Post-harvest selection of high-quality Sichuan pepper is a critical step in the production process. To achieve this, a visual system needs to analyze Sichuan pepper with varying postures and maturity levels. To quickly and accurately sort high-quality fresh Sichuan pepper, this study proposes a multi-scale frequency domain feature fusion module (MSF3M) and a multi-scale dual-domain feature fusion module (MS-DFFM) to construct a multi-scale, multi-domain fusion algorithm for feature fusion of Sichuan pepper images. The MultiDomain YOLOv8 Model network is then built to segment and classify the target Sichuan pepper, distinguishing the maturity level of individual Sichuan peppercorns. A selection method based on the average local pixel value difference is proposed for sorting high-quality fresh Sichuan pepper. Experimental results show that the MultiDomain YOLOv8-seg achieves an mAP50 of 88.8% for the segmentation of fresh Sichuan pepper, with a model size of only 5.84 MB. The MultiDomain YOLOv8-cls excels in Sichuan pepper maturity classification, with an accuracy of 98.34%. Compared to the YOLOv8 baseline model, the MultiDomain YOLOv8 model offers higher accuracy and a more lightweight structure, making it highly effective in reducing misjudgments and enhancing post-harvest processing efficiency in agricultural applications, ultimately increasing producer profits. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
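The sorting criterion in the record above, an average local pixel value difference, is not fully specified in the abstract. One plausible reading, sketched here purely as an assumption (the paper's exact definition may differ), scores a grayscale image by the mean absolute deviation of each pixel from its 3x3 neighbourhood mean:

```python
def avg_local_pixel_diff(img):
    """Mean absolute difference between each pixel and the mean of its
    3x3 neighbourhood (clipped at image borders).

    img: 2-D list of grayscale values; returns a single score, where
    higher values indicate more local texture variation.
    """
    h, w = len(img), len(img[0])
    total, n = 0.0, 0
    for y in range(h):
        for x in range(w):
            # Gather the 3x3 window around (x, y), clipped to the image.
            neigh = [img[j][i]
                     for j in range(max(0, y - 1), min(h, y + 2))
                     for i in range(max(0, x - 1), min(w, x + 2))]
            total += abs(img[y][x] - sum(neigh) / len(neigh))
            n += 1
    return total / n
```

A perfectly uniform patch scores 0.0; textured patches score higher, so a threshold on this score could separate smooth from rough pepper surfaces.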
46. Instance segmentation from small dataset by a dual-layer semantics-based deep learning framework.
- Author
-
Chen, YiMing, Li, JianWei, Hu, XiaoBing, Liu, YiRui, Ma, JianKai, Xing, Chen, Li, JunJie, Wang, ZhiJun, and Wang, JinCheng
- Abstract
Efficient and accurate segmentation of complex microstructures is a critical challenge in establishing process-structure-property (PSP) linkages of materials. Deep learning (DL)-based instance segmentation algorithms show potential in achieving this goal. However, to ensure prediction reliability, current algorithms usually have complex structures and demand vast training data. To overcome this model complexity and data dependence, we developed an ingenious DL framework based on a simple method called dual-layer semantics. In the framework, a data standardization module removes extraneous microstructural noise and accentuates desired structural characteristics, while a post-processing module further improves segmentation accuracy. The framework was successfully applied to a small dataset of bimodal Ti-6Al-4V microstructures with only 112 samples. Compared with the ground truth, it achieves an IoU of 86.81% for the globular α phase and a 94.70% average size-distribution similarity for the colony structures. More importantly, only 36 s is needed to process a 1024 × 1024 micrograph, much faster than treatment by experienced experts (usually 900 s). The framework proved reliable, interpretable, and scalable, enabling its utilization in complex microstructures to deepen the understanding of PSP linkages. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
47. Integrated Scale-Adaptive Adjustment Factor-Enhanced BlendMask Method for Pineapple Processing System.
- Author
-
Wang, Haotian, Zhang, Haojian, Zhang, Yukai, Deng, Jieren, Liu, Chengbao, and Tan, Jie
- Subjects
COMPUTER vision, WASTE minimization, DEEP learning, MACHINE learning, FOOD industry, PINEAPPLE
- Abstract
This study addresses the challenge of efficiently peeling pineapples, which have a distinct elliptical form, thick skin, and small eyes that are difficult to detect with conventional automated methods. This results in significant flesh waste. To improve the process, we developed an integrated system combining an enhanced BlendMask method, termed SAAF-BlendMask, and a Pose Correction Planning (PCP) method. SAAF-BlendMask improves the detection of small pineapple eyes, while PCP ensures accurate posture adjustment for precise path planning. The system uses 3D vision and deep learning technologies, achieving an average precision (AP) of 73.04% and a small object precision (APs) of 62.54% in eye detection, with a path planning success rate reaching 99%. The fully automated electromechanical system was tested on 110 real pineapples, demonstrating a reduction in flesh waste by 11.7% compared to traditional methods. This study highlights the potential of advanced machine vision and robotics in enhancing the efficiency and precision of food processing. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
48. A dual-labeled dataset and fusion model for automatic teeth segmentation, numbering, and state assessment on panoramic radiographs
- Author
-
Wenbo Zhou, Xin Lu, Dan Zhao, Meng Jiang, Linlin Fan, Weihang Zhang, Fenglin Li, Dezhou Wang, Weihuang Yin, and Xin Liu
- Subjects
Deep learning, Instance segmentation, Classification, Panoramic radiograph, Teeth segmentation, Dentistry, RK1-715
- Abstract
Abstract Background Recently, deep learning has been increasingly applied in the field of dentistry. The aim of this study is to develop a model for the automatic segmentation, numbering, and state assessment of teeth on panoramic radiographs. Methods We created a dual-labeled dataset on panoramic radiographs for training, incorporating both numbering and state labels. We then developed a fusion model that combines a YOLOv9-e instance segmentation model with an EfficientNetv2-l classification model. The instance segmentation model is used for tooth segmentation and numbering, whereas the classification model is used for state evaluation. The final prediction results integrate tooth position, numbering, and state information. The model’s output includes result visualization and automatic report generation. Results Precision, Recall, mAP50 (mean Average Precision), and mAP50-95 for the tooth instance segmentation task are 0.989, 0.955, 0.975, and 0.840, respectively. Precision, Recall, Specificity, and F1 Score for the tooth classification task are 0.943, 0.933, 0.985, and 0.936, respectively. Conclusions This fusion model is the first to integrate automatic dental segmentation, numbering, and state assessment. It provides highly accurate results, including detailed visualizations and automated report generation.
- Published
- 2024
- Full Text
- View/download PDF
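The fusion step in the record above, with segmentation and numbering from one network and state assessment from another, amounts to running a classifier on each segmented tooth and merging the outputs into one report. A minimal sketch under that reading; the field names and the `classify_crop` callable are illustrative stand-ins for the YOLOv9-e and EfficientNetv2-l models, not the paper's API:

```python
def fuse_predictions(detections, classify_crop):
    """Combine instance-segmentation output with a per-tooth state
    classifier into a single structured report.

    detections    : list of dicts {"number": str, "box": (x1, y1, x2, y2)}
                    standing in for the segmentation model's output
    classify_crop : callable mapping a box to a state label, standing in
                    for the classification network run on the cropped tooth
    """
    report = []
    for det in detections:
        state = classify_crop(det["box"])  # second-stage state assessment
        report.append({"tooth": det["number"],
                       "box": det["box"],
                       "state": state})
    return report
```

The merged records carry position, FDI-style numbering, and state together, which is what enables the automatic report generation described in the abstract.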
49. Instance segmentation by blend U‐Net and VOLO network
- Author
-
Hongfei Deng, Bin Wen, Rui Wang, and Zuwei Feng
- Subjects
colouring, computer vision, COT Encoder, instance segmentation, VOLO, Computer applications to medicine. Medical informatics, R858-859.7, Computer software, QA76.75-76.765
- Abstract
Abstract Instance segmentation still struggles to correctly distinguish individual instances among overlapping, dense, and numerous target objects. To address this, the authors simplify the instance segmentation problem to an instance classification problem and propose CotuNet, a novel end‐to‐end trained instance segmentation algorithm. Firstly, the algorithm combines convolutional neural networks (CNNs), Outlooker, and Transformer to design a new hybrid encoder (COT) for feature extraction: low‐level image features are extracted by the CNN and passed through the Outlooker to obtain more refined local data representations, and global contextual information is then generated by aggregating these local representations with the Transformer. Finally, a combination of cascaded upsampling and skip-connection modules is used as the decoder (C‐UP), blending high‐resolution information at multiple scales to generate accurate masks. Validation on the CVPPP 2017 dataset and comparison with previous state‐of‐the‐art methods show that CotuNet delivers superior competitiveness and segmentation performance.
- Published
- 2024
- Full Text
- View/download PDF
50. DAMM for the detection and tracking of multiple animals within complex social and environmental settings
- Author
-
Gaurav Kaul, Jonathan McDevitt, Justin Johnson, and Ada Eban-Rothschild
- Subjects
Animal behavior ,Animal tracking ,Computer vision ,Generalization ,Instance segmentation ,Medicine ,Science - Abstract
Abstract Accurate detection and tracking of animals across diverse environments are crucial for studying brain and behavior. Recently, computer vision techniques have become essential for high-throughput behavioral studies; however, localizing animals in complex conditions remains challenging due to intra-class visual variability and environmental diversity. These challenges hinder studies in naturalistic settings, such as when animals are partially concealed within nests. Moreover, current tools are laborious and time-consuming, requiring extensive, setup-specific annotation and training procedures. To address these challenges, we introduce the 'Detect-Any-Mouse-Model' (DAMM), an object detector for localizing mice in complex environments with minimal training. Our approach involved collecting and annotating a diverse dataset of single- and multi-housed mice in complex setups. We trained a Mask R-CNN, a popular object detector in animal studies, to perform instance segmentation and validated DAMM’s performance on a collection of downstream datasets using zero-shot and few-shot inference. DAMM excels in zero-shot inference, detecting mice and even rats in entirely unseen scenarios, and further improves with minimal training. Using the SORT algorithm, we demonstrate robust tracking, competitive with keypoint-estimation-based methods. Notably, to advance and simplify behavioral studies, we release our code, model weights, and data, along with a user-friendly Python API and a Google Colab implementation.
- Published
- 2024
- Full Text
- View/download PDF
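DAMM, in the record above, couples per-frame instance segmentation with the SORT algorithm for tracking. Full SORT adds a Kalman-filter motion model and Hungarian assignment; the sketch below keeps only the association idea, greedy IoU matching of new detections to existing tracks, as a simplified illustrative stand-in rather than the authors' implementation:

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def update_tracks(tracks, detections, next_id, thr=0.3):
    """One tracking step: greedily match each existing track to its
    best-overlapping unmatched detection; unmatched detections start
    new tracks. Returns (updated_tracks, next_id).

    tracks     : dict mapping track id -> last known box
    detections : list of boxes from the current frame's detector output
    """
    new_tracks, used = {}, set()
    for tid, box in tracks.items():
        best, best_iou = None, thr  # require IoU above threshold to match
        for i, det in enumerate(detections):
            if i in used:
                continue
            v = iou(box, det)
            if v > best_iou:
                best, best_iou = i, v
        if best is not None:
            new_tracks[tid] = detections[best]
            used.add(best)
    for i, det in enumerate(detections):
        if i not in used:  # unmatched detection -> new animal track
            new_tracks[next_id] = det
            next_id += 1
    return new_tracks, next_id
```

Calling `update_tracks` once per frame on the detector's boxes maintains stable identities as long as each animal overlaps its previous position; the Kalman prediction in real SORT relaxes exactly that requirement.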