2,941 results on '"stereo matching"'
Search Results
2. Review of stereo matching based on deep learning
- Author
-
Zhang, Shangshang, Su, Weixing, Liu, Fang, and Sun, Lincheng
- Published
- 2025
- Full Text
- View/download PDF
3. DCVSMNet: Double Cost Volume Stereo Matching Network
- Author
-
Tahmasebi, Mahmoud, Huq, Saif, Meehan, Kevin, and McAfee, Marion
- Published
- 2025
- Full Text
- View/download PDF
4. A two-stage 3D multi-fish tracking model using patch-based underwater stereo matching
- Author
-
Li, Yuxiang, Tan, Hequn, Deng, Yuxuan, Zhou, Dianzhuo, and Zhu, Ming
- Published
- 2025
- Full Text
- View/download PDF
5. As-Global-As-Possible stereo matching with Sparse Depth Measurement Fusion
- Author
-
Yao, Peng and Sang, Haiwei
- Published
- 2025
- Full Text
- View/download PDF
6. MCF-SMSIS: Multi-tasking with complementary functions for stereo matching and surgical instrument segmentation
- Author
-
Wu, Renkai, He, Changyu, Liang, Pengchen, Liu, Yinghao, Huang, Yiqi, Liu, Weiping, Shu, Biao, Xu, Panlong, and Chang, Qing
- Published
- 2024
- Full Text
- View/download PDF
7. Multi-line structured light stripes clustering based on a custom iterative window
- Author
-
Li, Wenguo, Deng, Haibo, Deng, Zhipeng, and Wu, Xingang
- Published
- 2024
- Full Text
- View/download PDF
8. Depth cue fusion for event-based stereo depth estimation
- Author
-
Ghosh, Dipon Kumar and Jung, Yong Ju
- Published
- 2025
- Full Text
- View/download PDF
9. Edge-Guided Fusion and Motion Augmentation for Event-Image Stereo
- Author
-
Zhao, Fengan, Zhou, Qianang, and Xiong, Junlin (series editor: Goos, Gerhard; founding editor: Hartmanis, Juris; editorial board members: Bertino, Elisa, Gao, Wen, Steffen, Bernhard, and Yung, Moti; volume editors: Leonardis, Aleš, Ricci, Elisa, Roth, Stefan, Russakovsky, Olga, Sattler, Torsten, and Varol, Gül)
- Published
- 2025
- Full Text
- View/download PDF
10. Temporally Consistent Stereo Matching
- Author
-
Zeng, Jiaxi, Yao, Chengtang, Wu, Yuwei, and Jia, Yunde
- Published
- 2025
- Full Text
- View/download PDF
11. Learning Representations from Foundation Models for Domain Generalized Stereo Matching
- Author
-
Zhang, Yongjian, Wang, Longguang, Li, Kunhong, Wang, Yun, and Guo, Yulan
- Published
- 2025
- Full Text
- View/download PDF
12. Temporal Event Stereo via Joint Learning with Stereoscopic Flow
- Author
-
Cho, Hoonhee, Kang, Jae-Young, and Yoon, Kuk-Jin
- Published
- 2025
- Full Text
- View/download PDF
13. Research on wave measurement and simulation experiments of binocular stereo vision based on intelligent feature matching.
- Author
-
Wu, Junjie, Chen, Shizhe, Liu, Shixuan, Song, Miaomiao, Wang, Bo, Zhang, Qingyang, Wu, Yushang, Lei, Zhuo, Zhang, Jiming, Yan, Xingkui, and Miao, Bin
- Subjects
BINOCULAR vision, PYRAMIDS, IMAGE registration, OCEAN, ALGORITHMS, STEREO vision (Computer science), PROTOTYPES
- Abstract
Waves are crucial in ocean observation and research. Stereo vision-based wave measurement, offering non-contact, low-cost, and intelligent processing, is an emerging method. However, improving accuracy remains a challenge due to wave complexity. This paper presents a novel approach to measure wave height, period, and direction by combining deep learning-based stereo matching with feature matching techniques. To address the discontinuity and low accuracy of disparity maps from traditional wave image matching algorithms, this paper proposes the use of a high-precision stereo matching method based on the Pyramid Stereo Matching Network (PSM-Net). A 3D reconstruction method integrating Scale-Invariant Feature Transform (SIFT) with stereo matching was also introduced to overcome the limitations of template matching and interleaved spectrum methods, which only provide 2D data and fail to capture the full 3D motion of waves. This approach enables accurate wave direction measurement. Additionally, a six-degree-of-freedom platform was proposed to simulate waves, addressing the high costs and attenuation issues of traditional wave tank simulations. Experimental results show the prototype system achieves a wave height accuracy within 5%, period accuracy within 4%, and direction accuracy of ±2°, proving the method's effectiveness and offering a new approach to stereo vision-based wave measurement. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
14. Estimation of Wind Turbine Blade Icing Volume Based on Binocular Vision.
- Author
-
Wei, Fangzheng, Guo, Zhiyong, Han, Qiaoli, and Qi, Wenkai
- Abstract
Icing on wind turbine blades in cold and humid weather has become a detrimental factor limiting their efficient operation, and traditional methods for detecting blade icing have various limitations. Therefore, this paper proposes a non-contact ice volume estimation method based on binocular vision and improved image processing algorithms. The method employs a stereo matching algorithm that combines dynamic windows, multi-feature fusion, and reordering, integrating gradient, color, and other information to generate matching costs. It utilizes a cross-based support region for cost aggregation and generates the final disparity map through a Winner-Take-All (WTA) strategy and multi-step optimization. Subsequently, combining image processing techniques and three-dimensional reconstruction methods, the geometric shape of the ice is modeled, and its volume is estimated using numerical integration methods. Experimental results on volume estimation show that for ice blocks with regular shapes, the errors between the measured and actual volumes are 5.28%, 8.35%, and 4.85%, respectively; for simulated icing on wind turbine blades, the errors are 5.06%, 6.45%, and 9.54%, respectively. The results indicate that the volume measurement errors under various conditions are all within 10%, meeting the experimental accuracy requirements for measuring the volume of ice accumulation on wind turbine blades. This method provides an accurate and efficient solution for detecting blade icing without the need to modify the blades, making it suitable for wind turbines already in operation. However, in practical applications, it may be necessary to consider the impact of illumination and environmental changes on visual measurements. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
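Illustrative sketch for result 14 (not taken from the paper): the abstract describes building matching costs and selecting the final disparity with a Winner-Take-All (WTA) strategy. The snippet below shows only the WTA step on a single scanline, and it swaps in a plain fixed-window sum of absolute differences (SAD) for the paper's dynamic-window, multi-feature cost and cross-based aggregation. The function names `sad_cost` and `wta_disparity` and the `radius` parameter are hypothetical.

```python
def sad_cost(left, right, x, d, radius=1):
    """SAD between a window centred at x in the left scanline and at x - d
    in the right scanline. Indices are clamped at the borders.
    NOTE: a fixed-window SAD is an assumption, not the paper's cost."""
    total = 0
    for k in range(-radius, radius + 1):
        xl = min(max(x + k, 0), len(left) - 1)
        xr = min(max(x + k - d, 0), len(right) - 1)
        total += abs(left[xl] - right[xr])
    return total


def wta_disparity(left, right, max_disp, radius=1):
    """Winner-Take-All: for each pixel keep the disparity with the lowest cost."""
    disparities = []
    for x in range(len(left)):
        # Only disparities that keep x - d inside the scanline are considered.
        costs = [sad_cost(left, right, x, d, radius)
                 for d in range(min(max_disp, x) + 1)]
        disparities.append(costs.index(min(costs)))
    return disparities


# Toy scanlines: the left line is the right line shifted by two pixels.
right_line = [0, 0, 10, 50, 90, 30, 0, 0, 0, 0]
left_line = [0, 0, 0, 0, 10, 50, 90, 30, 0, 0]
disp = wta_disparity(left_line, right_line, max_disp=4)
```

In the textured part of the toy scanline the selected disparity matches the two-pixel shift; flat regions are ambiguous, which is the kind of case the cost aggregation and multi-step optimization mentioned in the abstract are meant to handle.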
15. Reliable and Effective Stereo Matching for Underwater Scenes.
- Author
-
Zhu, Lvwei, Gao, Ying, Zhang, Jiankai, Li, Yongqing, and Li, Xueying
- Subjects
DEPTH perception, OPTICAL distortion, REMOTE submersibles, IMAGE registration, INTERPOLATION
- Abstract
Stereo matching plays a vital role in underwater environments, where accurate depth estimation is crucial for applications such as robotics and marine exploration. However, underwater imaging presents significant challenges, including noise, blurriness, and optical distortions that hinder effective stereo matching. This study develops two specialized stereo matching networks: UWNet and its lightweight counterpart, Fast-UWNet. UWNet utilizes self- and cross-attention mechanisms alongside an adaptive 1D-2D cross-search to enhance cost volume representation and refine disparity estimation through a cascaded update module, effectively addressing underwater imaging challenges. Due to the need for timely responses in underwater operations by robots and other devices, real-time processing speed is critical for task completion. Fast-UWNet addresses this challenge by prioritizing efficiency, eliminating the reliance on the time-consuming recurrent updates commonly used in traditional methods. Instead, it directly converts the cost volume into a set of disparity candidates and their associated confidence scores. Adaptive interpolation, guided by content and confidence information, refines the cost volume to produce the final accurate disparity. This streamlined approach achieves an impressive inference speed of 0.02 s per image. Comprehensive tests conducted in diverse underwater settings demonstrate the effectiveness of both networks, showcasing their ability to achieve reliable depth perception. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
16. Estimating the timber value of a forest property using geographically balanced samples and unoccupied aerial vehicle data.
- Author
-
Räty, Janne, Heikkinen, Juha, Kukkonen, Mikko, Mehtätalo, Lauri, Kangas, Annika, and Packalen, Petteri
- Subjects
POINT cloud, STEREOPHONIC sound systems, THREE-dimensional imaging, VALUATION of real property, TIMBER
- Abstract
A common task in forestry is to determine the value of a forest property, and timber is the most valuable component of that property. Remotely sensed data collected by an unoccupied aerial vehicle (UAV) are suited for this purpose as most forest properties are of a size that permits the efficient collection of UAV data. These UAV data, when linked to a probability sample of field plots, enable the model-assisted (MA) estimation of the timber value and its associated uncertainty. Our objective was to estimate the value of timber (€/ha) in a 40-ha forest property in Finland. We used a systematic sample of field plots (n = 160) and 3D image point cloud data collected by a UAV. First, we studied the effects of spatial autocorrelation on the variance estimates associated with the timber value estimates produced using a field data-based simple expansion (EXP) estimator. The variance estimators compared were simple random sampling, Matérn, and a variant of the Grafström–Schelin estimator. Second, we compared the efficiencies of the EXP and MA estimators under different sampling intensities. The sampling intensity was varied by subsampling the systematic sample of 160 field plots. In the case of the EXP estimator, the simple random sampling variance estimator produced the largest variance estimates, whereas the Matérn estimator produced smaller variance estimates than the Grafström–Schelin estimator. The MA estimator was more efficient than the EXP estimator, which suggested that the reduction of sampling intensity from 160 to 60 plots is possible without deterioration in precision. The results suggest that the use of UAV data improves the precision of timber value estimates compared to the use of field data only. In practice, the proposed application improves the cost-efficiency of the design-based appraisal of a forest property because expensive field workload can be reduced by means of UAV data. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
17. Adaptive Kernel Convolutional Stereo Matching Recurrent Network.
- Author
-
Wang, Jiamian, Sun, Haijiang, and Jia, Ping
- Subjects
IMAGE representation, GENERALIZATION, PIXELS, COST, IMAGE registration
- Abstract
For binocular stereo matching, the most advanced methods currently use an iterative structure based on GRUs. Methods in this class have shown high performance on both high-resolution images and standard benchmarks. However, simply replacing cost aggregation with a GRU iterative method leaves the original cost volume used for disparity calculation without non-local geometric and contextual information. To address this, this paper proposes a new GRU iteration-based adaptive kernel convolution deep recurrent network architecture for stereo matching. It introduces a kernel convolution-based adaptive multi-scale pyramid pooling (KAP) module that fully considers the spatial correlation between pixels and adds new matching attention (MAR) to refine the matching cost volume before inputting it into the iterative network for iterative updates, enhancing the pixel-level representation ability of the image and improving the overall generalization ability of the network. The proposed AKC-Stereo network improves on the base network. On the Scene Flow dataset, the EPE of AKC-Stereo reaches 0.45, an improvement of 0.02 over the base network. On the KITTI 2015 dataset, the AKC-Stereo network outperforms the base network by 5.6% on the D1-all metric. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
18. Research on a Method for Measuring the Pile Height of Materials in Agricultural Product Transport Vehicles Based on Binocular Vision.
- Author
-
Qian, Wang, Wang, Pengyong, Wang, Hongjie, Wu, Shuqin, Hao, Yang, Zhang, Xiaoou, Wang, Xinyu, Sun, Wenyan, Guo, Haijie, and Guo, Xin
- Subjects
HEIGHT measurement, AGRICULTURAL equipment, AGRICULTURAL development, AGRICULTURE, FARM produce, BINOCULAR vision
- Abstract
The advancement of unloading technology in combine harvesting is crucial for the intelligent development of agricultural machinery. Accurately measuring material pile height in transport vehicles is essential, as uneven accumulation can lead to spillage and voids, reducing loading efficiency. Relying solely on manual observation for measuring stack height can decrease harvesting efficiency and pose safety risks due to driver distraction. This research applies binocular vision to agricultural harvesting, proposing a novel method that uses a stereo matching algorithm to measure material pile height during harvesting. By comparing distance measurements taken in both empty and loaded states, the method determines stack height. A linear regression model processes the stack height data, enhancing measurement accuracy. A binocular vision system was established, applying Zhang's calibration method on the MATLAB (R2019a) platform to correct camera parameters, achieving a calibration error of 0.15 pixels. The study implemented block matching (BM) and semi-global block matching (SGBM) algorithms using the OpenCV (4.8.1) library on the PyCharm (2020.3.5) platform for stereo matching, generating disparity and pseudo-color maps. Three-dimensional coordinates of key points on the piled material were calculated to measure distances from the vehicle container bottom and material surface to the binocular camera, allowing for the calculation of material pile height. Furthermore, a linear regression model was applied to correct the data, enhancing the accuracy of the measured pile height. The results indicate that by employing binocular stereo vision and stereo matching algorithms, followed by linear regression, this method can accurately calculate material pile height. The average relative error for the BM algorithm was 3.70%, and for the SGBM algorithm, it was 3.35%, both within the acceptable precision range. 
While the SGBM algorithm was, on average, 46 ms slower than the BM algorithm, both maintained errors under 7% and computation times under 100 ms, meeting the real-time measurement requirements for combine harvesting. In practical operations, this method can effectively measure material pile height in transport vehicles. The choice of matching algorithm should consider container size, material properties, and the balance between measurement time, accuracy, and disparity map completeness. This approach aids in manual adjustment of machinery posture and provides data support for future autonomous master-slave collaborative operations in combine harvesting. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
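Illustrative sketch for result 18 (not the authors' code): the abstract describes measuring the camera-to-surface distance in the empty and loaded states, taking the difference as the pile height, and correcting it with a linear regression model. The snippet below shows that arithmetic under stated assumptions: a rectified stereo pair, so that depth follows the standard relation Z = f * B / d, and a regression that maps measured heights to reference heights. The function names, the focal length, the baseline, and the sample numbers are all hypothetical.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth of a point from its disparity in a rectified stereo pair (Z = f * B / d)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def pile_height(empty_distance_m, loaded_distance_m):
    """Pile height as camera-to-container-bottom distance minus camera-to-surface distance."""
    return empty_distance_m - loaded_distance_m


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical numbers, for illustration only.
empty = depth_from_disparity(40.0, focal_px=1000.0, baseline_m=0.1)   # container bottom
loaded = depth_from_disparity(50.0, focal_px=1000.0, baseline_m=0.1)  # material surface
raw_height = pile_height(empty, loaded)

# Assumed calibration pairs: (measured height, reference height) in metres.
slope, intercept = fit_line([0.2, 0.4, 0.6], [0.21, 0.41, 0.61])
corrected_height = slope * raw_height + intercept
```

Which variables the paper's regression actually uses is not stated in the abstract, so the measured-to-reference mapping here is only one plausible reading.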
19. Measurement of growth factors of juvenile Erythrophleum fordii based on a binocular vision system.
- Author
-
王鹏, 王雪峰, and 赵溪月
- Abstract
To achieve non-destructive and efficient identification of juvenile trees, the binocular measurement method of sapling parameters was studied using 3-year-old juvenile Erythrophleum fordii trees as the research object. Side image pairs of 52 E. fordii were obtained using a RealSense D415 binocular camera. The images were preprocessed using gamma correction and the contrast-limited adaptive histogram equalization (CLAHE) algorithm. Then, the left and right views were matched using an improved semi-global block matching (SGBM) algorithm. The pixels were spatially mapped according to the triangulation principle to generate three-dimensional point cloud data. By extracting the coordinate information of the key points in the point cloud, the height, crown width, and ground diameter of E. fordii were measured. The results show that the image's contrast was significantly enhanced after preprocessing. The transition between different gray areas was smoother, which made the disparity calculation more accurate and improved the matching accuracy. The improved SGBM algorithm had a better matching effect than the traditional SGBM algorithm. The obtained disparity map was smoother and the disparity continuity was stronger. The obtained point cloud density was higher and there was less noise. The measurement errors for tree height, crown width, and ground diameter were smaller. The mean relative errors (Emr) were 1.981%, 2.459%, and 2.942%, respectively. The mean absolute errors (Ema) were 1.492 cm, 1.567 cm, and 0.044 cm, respectively. The root mean square errors (Erms) were 1.843 cm, 1.914 cm, and 0.060 cm, respectively. In general, the binocular measurement method based on the improved SGBM algorithm significantly improved the measurement accuracy of juvenile E. fordii parameters, providing technical support for the precise cultivation and management of juvenile trees. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
20. The Adversarial Robust and Generalizable Stereo Matching for Infrared Binocular Based on Deep Learning.
- Author
-
Liu, Bowen, Ji, Jiawei, Tao, Cancan, Li, Jujiu, and Wang, Yingxun
- Subjects
DEEP learning, INFRARED cameras, STEREO image, INFRARED imaging, IMAGE registration
- Abstract
Despite the considerable success of deep learning methods in stereo matching for binocular images, the generalizability and robustness of these algorithms, particularly under challenging conditions such as occlusions or degraded infrared textures, remain uncertain. This paper presents a novel deep-learning-based depth optimization method that obviates the need for large infrared image datasets and adapts seamlessly to any specific infrared camera. Moreover, this adaptability extends to standard binocular images, allowing the method to work effectively on both infrared and visible light stereo images. We further investigate the role of infrared textures in a deep learning framework, demonstrating their continued utility for stereo matching even in complex lighting environments. To compute the matching cost volume, we apply the multi-scale census transform to the input stereo images. A stacked hourglass subnetwork is subsequently employed to address the matching task. Our approach substantially improves adversarial robustness while maintaining accuracy in comparison with state-of-the-art methods, which decrease by nearly half in EPE in quantitative results on widely used autonomous driving datasets. Furthermore, the proposed method exhibits superior generalization capabilities, transitioning from simulated datasets to real-world datasets without the need for fine-tuning. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
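Illustrative sketch for result 20 (not the authors' implementation): the abstract says the matching cost volume is computed from a multi-scale census transform. The snippet below shows the basic single-scale census transform and the Hamming-distance cost usually paired with it. The single 3x3 window, the border clamping, and the names `census_transform` and `hamming_cost` are assumptions made for illustration; the paper's multi-scale variant is not reproduced here.

```python
def census_transform(image, radius=1):
    """Encode each pixel as a bit string: one bit per neighbour, set to 1 when
    the neighbour is darker than the centre pixel. Borders are clamped."""
    height, width = len(image), len(image[0])
    codes = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            code = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    if dy == 0 and dx == 0:
                        continue  # the centre pixel is not compared with itself
                    yy = min(max(y + dy, 0), height - 1)
                    xx = min(max(x + dx, 0), width - 1)
                    bit = 1 if image[yy][xx] < image[y][x] else 0
                    code = (code << 1) | bit
            codes[y][x] = code
    return codes


def hamming_cost(code_a, code_b):
    """Matching cost between two census codes: number of differing bits."""
    return bin(code_a ^ code_b).count("1")


patch = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
brighter = [[value + 10 for value in row] for row in patch]
codes = census_transform(patch)
codes_brighter = census_transform(brighter)
```

Because the code depends only on the ordering of intensities inside the window, a uniform brightness offset leaves it unchanged, which is one reason census-based costs are often chosen for images with unreliable absolute intensity.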
21. Active stereo vision matching algorithm based on a four-color structured-light line pattern.
- Author
-
行文轩 and 王振洲
- Abstract
Copyright of Journal of Ordnance Equipment Engineering is the property of Chongqing University of Technology and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
22. Accelerated Reconstruction of Scenes Using CUDA-Based Parallel Computing
- Author
-
Gui Zou, Jin Jiang, and Qi Chen
- Subjects
Unmanned aerial vehicles (UAVs), stereo matching, scene reconstruction, CUDA optimization, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
In recent years, perceiving the environment has been the most important part of unmanned aerial vehicle (UAV) inspection tasks in battlefield scenes. This article studies how to quickly reconstruct unknown battlefield scenes, such as buildings and streets, using UAVs equipped with binocular cameras and stereo matching technology. We use a binocular module to reduce the cost of drone-borne equipment and lighten the load of the UAVs. Since the drone needs to use an embedded onboard computer and stereo matching algorithms require massive computing resources, this paper studies the optimization of CUDA-based stereo matching algorithms on embedded devices to improve the reconstruction quality of scenes and achieve rapid reconstruction, thereby reducing time costs and enhancing drone efficiency.
- Published
- 2025
- Full Text
- View/download PDF
23. A Dual Branch Multiscale Stereo Matching Network for High-Resolution Satellite Remote Sensing Images
- Author
-
Zhenghui Xu, Yonghua Jiang, Jingxue Wang, and Yunming Wang
- Subjects
Disparity-channel attention, dual branch feature extraction (DBFE), satellite remote sensing stereo images, stereo matching, Ocean engineering, TC1501-1800, Geophysics. Cosmic physics, QC801-809
- Abstract
Accurate disparity estimation of high-resolution satellite remote sensing stereo images serves as a crucial method for generating precise digital surface models. However, the complex intractable regions in satellite images (textureless regions, repeated texture regions, occlusion regions) pose serious challenges for accurate disparity estimation. To enhance the matching accuracy within intractable regions, a dual branch multiscale stereo matching network for high-resolution satellite stereo images is proposed. First, a dual branch feature extraction module is designed, which can perform efficient downsampling. This module can enhance the scene awareness capability of the model, enabling it to extract multiscale feature maps and construct multiscale cost volumes. Then, the cost aggregation process is executed in a coarse-to-fine manner. The method employs a simple hourglass structure and leverages low-scale information to guide the aggregation of high-scale cost volumes. Next, a disparity-channel attention mechanism is proposed for the cost aggregation process to obtain more representative feature information. Finally, a simple disparity refinement module is designed by utilizing both intensity and gradient information of the left image to improve the local details of the disparity map. Experiments are performed separately on the GaoFen-7 and US3D datasets. The experimental results indicate that the proposed method improves the matching accuracy within intractable regions of satellite images. The structure of the proposed network is simple, which effectively reduces the number of network parameters and makes the model lightweight.
- Published
- 2025
- Full Text
- View/download PDF
24. Exploring the Usage of Pre-trained Features for Stereo Matching.
- Author
-
Zhang, Jiawei, Huang, Lei, Bai, Xiao, Zheng, Jin, Gu, Lin, and Hancock, Edwin
- Subjects
CONTINGENT employment, SPINE, NECK
- Abstract
For many vision tasks, utilizing pre-trained features results in improved performance and consistently benefits from the rapid advancement of pre-training technologies. However, in the field of stereo matching, the use of pre-trained features has not been extensively researched. In this paper, we present the first systematic exploration of the utilization of pre-trained features for stereo matching. To provide flexible employment for any combination of pre-trained backbones and stereo matching networks, we develop the deformable neck (DN) that decouples the network architectures of these two components. The core idea of DN is to utilize the deformable attention mechanism to iteratively fuse pre-trained features from shallow to deep layers. Empirically, our exploration reveals the crucial factors that influence using pre-trained features for stereo matching. We further investigate the role of instance-level information of pre-trained features, demonstrating that it benefits stereo matching but can be suppressed during convolution-based feature fusion. Built on the attention mechanism, the proposed DN module effectively utilizes the instance-level information in pre-trained features. In addition, we provide an understanding of the efficiency-accuracy tradeoff, concluding that using pre-trained features can also be a good alternative when efficiency is a consideration. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
25. Multi-level pyramid fusion for efficient stereo matching.
- Author
-
Zhu, Jiaqi, Li, Bin, and Zhao, Xinhua
- Abstract
Stereo matching is a key technology for many autonomous driving and robotics applications. Recently, methods based on convolutional neural networks have achieved huge progress. However, it is still difficult to find accurate matching points in inherently ill-posed regions such as areas with weak texture and reflective surfaces. In this paper, we propose a multi-level pyramid fusion volume (MPFV-Stereo) which contains two prominent components: multi-scale cost volume (MSCV) and multi-level cost volume (MLCV). We also design a low-parameter Gaussian attention module to excite the cost volume. Our MPFV-Stereo ranks 2nd on KITTI 2012 (Reflective) among all published methods. In addition, MPFV-Stereo has competitive results on both the Scene Flow and KITTI datasets and requires less training to achieve strong cross-dataset generalization on the Middlebury and ETH3D benchmarks. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
26. Multi-scale inputs and context-aware aggregation network for stereo matching.
- Author
-
Shi, Liqing, Xiong, Taiping, Cui, Gengshen, Pan, Minghua, Cheng, Nuo, and Wu, Xiangjie
- Subjects
BINOCULAR vision, BLOCK designs, ENCODING, SPEED, FORECASTING
- Abstract
Despite the significant progress made in deep learning-based stereo matching, the accuracy of these methods significantly decreases when faced with challenges such as occlusions, reflections, textureless areas, and scale variations. In this paper, we propose MSCANet, a novel stereo matching network that integrates multi-scale inputs and context-aware aggregation ability. MSCANet effectively integrates rich multi-scale feature information and exhibits context-aware capability, thereby enabling it to achieve superior performance. Firstly, a multi-scale aware fusion module is designed to efficiently incorporate more comprehensive global context features at different scales, which allows the model to enhance its ability to generalize across images of varying scales. Secondly, a novel V-shaped encoder/decoder module is developed to effectively exploit the rich feature information. In the encoding stage, a 3D squeeze-and-excitation block is introduced to facilitate adaptive recalibration of learned feature maps. This block effectively suppresses irrelevant features while enhancing useful features, which improves efficiency and accuracy in disparity prediction. Additionally, a 3D context-aware decode block is designed to effectively utilize global context features to restore the original image structure during the decoding stage. Moreover, the high-level feature maps can be employed to augment low-level feature maps by incorporating more detailed information to avoid the side effects caused by the loss of information during the encoding process. Extensive ablation experiments and comparative experiments were conducted on the Scene Flow, KITTI 2012, and KITTI 2015 datasets to validate the effectiveness of each proposed module. The experimental results demonstrate that MSCANet achieves competitive performance and offers a more straightforward and efficient model design, as well as faster inference speed. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
27. Matchability and Uncertainty-Aware Iterative Disparity Refinement for Stereo Matching.
- Author
-
Wang, Junwei, Zhou, Wei, Tang, Yujun, and Guo, Hanming
- Subjects
CONVOLUTIONAL neural networks, DEEP learning, MEDICAL masks, MUD
- Abstract
After significant progress in stereo matching, the pursuit of robust and efficient ill-posed-region disparity refinement methods remains challenging. To further improve the performance of disparity refinement, in this paper, we propose the matchability and uncertainty-aware iterative disparity refinement neural network. Firstly, a new matchability and uncertainty decoder (MUD) is proposed to decode the matchability mask and disparity uncertainties, which are used to evaluate the reliability of feature matching and estimated disparity, thereby reducing the susceptibility to mismatched pixels. Then, based on the proposed MUD, we present two modules: the uncertainty-preferred disparity field initialization (UFI) and the masked hidden state global aggregation (MGA) modules. In the UFI, a multi-disparity window scan-and-select method is employed to provide a further initialized disparity field and more accurate initial disparity. In the MGA, the adaptive masked disparity field hidden state is globally aggregated to extend the propagation range per iteration, improving the refinement efficiency. Finally, the experimental results on public datasets show that the proposed model achieves reductions of up to 17.9% in disparity average error and 16.9% in occluded outlier proportion, demonstrating its more practical handling of ill-posed regions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
28. Research on a fast stereo matching algorithm with irregular-shaped windows.
- Author
-
杨兴梅, 黄林海, 吴晓松, and 顾乃庭
- Subjects
PARALLAX, COMPUTER vision, COMPUTER simulation, ALGORITHMS, SPEED, PIXELS
- Abstract
Copyright of Journal of Chongqing University of Technology (Natural Science) is the property of Chongqing University of Technology and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
29. Load prediction of bucket elevator for jigging sorting based on binocular vision and Global Filter Networks.
- Author
-
Li, Jiawei, Dou, Dongyang, and Huang, Guodong
- Subjects
BINOCULAR vision, COAL preparation, IMAGE segmentation, POINT cloud, ELEVATORS, DEEP learning
- Abstract
In jigging sorting, products are elevated by bucket elevators and transported via a shared belt. The product weight of each jig cannot be detected. The output of each product is the key parameter to calculate the jigging separation efficiency and measure the separation effect. At the same time, bucket elevator overload can cause accidents, resulting in deformation and rupture of the bucket. To solve this problem, an intelligent monitoring scheme for bucket elevators based on binocular vision and deep learning was proposed. The YOLACT algorithm was utilized for coal and gangue image segmentation. To address the multi-layer accumulation of materials in the bucket elevator, a Global Filter Network (GFNet) based on the Pyramid Stereo Matching Network was used for more accurate stereo matching, and statistical filtering was used to denoise the generated 3D point cloud. The volume prediction model was established by integration. The Bootstrap method was used to obtain the best empirical density of coal and gangue. Finally, the volume parameters and density parameters were multiplied to get the mass of coal and gangue. The experimental results showed that the GFNet algorithm can obtain clear 3D point clouds. The average error of the volume calculation model based on these 3D point clouds was 4.19%. In the single-machine test at the coal preparation plant, the proposed system realized real-time detection of the mass of materials carried by the bucket elevator. Compared with the electronic belt scale, the average error was 10.55%, which meets the industrial demand. [ABSTRACT FROM AUTHOR]
- Published
- 2024
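The last step of the abstract above (a volume obtained by integration, multiplied by an empirical density to give a mass) can be illustrated with a minimal sketch. It assumes the denoised point cloud has already been resampled onto a regular grid of material heights. The grid, the cell area, the density value, and both function names are hypothetical and are not taken from the paper.

```python
def volume_from_height_grid(heights, cell_area):
    """Approximate the volume integral as a Riemann sum: sum of heights times cell area."""
    return sum(h for row in heights for h in row) * cell_area


def mass_from_volume(volume, density):
    """Mass is the estimated volume multiplied by the empirical density."""
    return volume * density


# Hypothetical 2 x 2 grid of material heights in metres, each cell covering 0.01 m^2.
heights = [[0.1, 0.2], [0.3, 0.4]]
volume = volume_from_height_grid(heights, cell_area=0.01)  # about 0.01 m^3
mass = mass_from_volume(volume, density=1400.0)  # assumed density in kg/m^3
```

A finer grid reduces the discretisation error of the sum. The reported 4.19% volume error refers to the authors' own model, not to this sketch.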
30. Research on a machine-vision-based method for estimating the canopy thickness of wine grapes.
- Author
-
马 聪 and 陈学东
- Abstract
The canopy thickness of grapevines is an important reference for vineyard management, shaping and pruning, variable spraying and other operations. However, monitoring the canopy thickness of grapevines suffers from problems such as high workload, low efficiency and high labor intensity. We focused on the canopy thickness of grapevines from the east Helan Mountain area, and used machine vision technology to study and validate a rapid estimation method for canopy thickness. We collected binocular images and parameters of the canopy, extracted the canopy part of the left image based on HSV color space and morphological processing methods, corrected the extracted image with the right-eye image, used the BM algorithm to perform stereo matching on the corrected image, calculated the three-dimensional coordinates of all pixel points within the leaf canopy range, and calculated the estimated canopy thickness of the grapevine based on the average z value and fixed parameters. After testing, the results of the proposed thickness estimation method were basically consistent with the actual measurement values. The method could provide a theoretical basis for developing thickness measurement equipment and for promoting the refinement of park management. [ABSTRACT FROM AUTHOR]
- Published
- 2024
31. Zebrafishtracker3D: A 3D skeleton tracking algorithm for multiple zebrafish based on particle matching.
- Author
-
Fu, Zhenhua, Zhang, Shuhui, Zhou, Lu, Wang, Yiwen, Feng, Xizeng, Zhao, Xin, and Sun, Mingzhu
- Subjects
TRACKING algorithms, BRACHYDANIO, LIFE sciences, BEHAVIORAL assessment, SKELETON, BONE spurs
- Abstract
Zebrafish are considered model organisms in biological and medical research because of their high degree of homology with human genes. Automatic behavioral analysis of multiple zebrafish based on visual tracking is expected to improve research efficiency. However, vision-based multi-object tracking algorithms often suffer from data loss owing to mutual occlusion. In addition, simply tracking zebrafish as points is not sufficient; more detailed information is required for research on zebrafish behavior. In this paper, we propose Zebrafishtracker3D, which utilizes a skeleton stability strategy to effectively reduce detection error caused by frequent overlapping of multiple zebrafish and estimates zebrafish skeletons using head coordinates in the top view. Further, we transform the front- and top-view matching task into an optimization problem and propose a particle-matching method to perform 3D tracking. The robustness of the algorithm with respect to occlusion is evaluated on a dataset comprising two and three zebrafish. Experimental results demonstrate that the proposed algorithm exhibits a multiple object tracking accuracy (MOTA) exceeding 90% in the top view and a 3D tracking matching accuracy exceeding 90% in complex videos with frequent overlapping. It is noteworthy that each instance in the trace saves its skeleton. In addition, applying Zebrafishtracker3D to a zebrafish courtship experiment establishes the stability of the method in life science applications and shows that the data can be used for behavioral analysis. Zebrafishtracker3D is the first algorithm that realizes 3D skeleton tracking of multiple zebrafish simultaneously.
• The first 3D multi-object zebrafish skeleton-level tracking method is proposed.
• It can effectively solve the problem of data loss caused by cross occlusion.
• A general method of bone spur removal based on morphology is proposed.
• The algorithm has been successfully applied to life science experiments.
[ABSTRACT FROM AUTHOR]
- Published
- 2024
32. Research on Binocular Vision Image Calibration Method Based on Canny Operator.
- Author
-
Lei Yan
- Abstract
In this paper, on the basis of in-depth research on the key technology of binocular vision measurement, a multidimensional online measurement system for image recognition is built. The Canny operator is used to detect the contour features of parts, and it is accelerated and improved through mathematical reasoning and a Gaussian pyramid. A synchronous external trigger circuit for a binocular camera and light source was designed. Finally, the improved algorithms for the various aspects of visual measurement are applied to the measurement system. The experimental results show that the online measurement system has the advantages of high measurement accuracy and small repeatability errors. [ABSTRACT FROM AUTHOR]
- Published
- 2024
33. Stereo Matching Method with Cost Volume Collaborative Filtering.
- Author
-
Wu, Wenhuan, Xu, Xi, Wang, Wenshu, and Zhang, Haokun
- Subjects
STEREO image, ERROR rates, PIXELS, AMBIGUITY, ALGORITHMS
- Abstract
Aiming at the problem of matching ambiguity and low disparity accuracy at the object boundary in stereo matching, a novel stereo matching algorithm with cost volume collaborative filtering is proposed. Firstly, for each pixel, two support windows are built, namely a local cross-support window and a global support window covering the whole image. Secondly, a new adaptive weighted guide filter with a cross-support window as a kernel window is derived, and it is used to locally filter the cost volume. In addition, a minimum spanning tree is constructed in the whole image window, and then the minimum spanning tree filter is used to globally filter the cost volume. The collaborative filtering of cost volume is realized by fusing the filtering results of the local filter and global filter, so that each pixel can not only receive the support of the neighboring pixels in the local adaptive window, but can also receive the effective support of other pixels in the whole image, thus effectively eliminating the matching ambiguity in different texture regions while maintaining the disparity edges. The experimental results show that the average matching error rate of our method on the Middlebury stereo images is 3.17%. Compared with the other state-of-the-art methods, our method has higher robustness and matching accuracy, the generated disparity maps are smoother, and the disparity edges are better preserved. [ABSTRACT FROM AUTHOR]
- Published
- 2024
34. Research on wave measurement and simulation experiments of binocular stereo vision based on intelligent feature matching
- Author
-
Junjie Wu, Shizhe Chen, Shixuan Liu, Miaomiao Song, Bo Wang, Qingyang Zhang, Yushang Wu, Zhuo Lei, Jiming Zhang, Xingkui Yan, and Bin Miao
- Subjects
stereo vision; deep learning; stereo matching; feature matching; wave parameter measurement; wave height; Science; General. Including nature conservation, geographical distribution; QH1-199.5
- Abstract
Waves are crucial in ocean observation and research. Stereo vision-based wave measurement, offering non-contact, low-cost, and intelligent processing, is an emerging method. However, improving accuracy remains a challenge due to wave complexity. This paper presents a novel approach to measure wave height, period, and direction by combining deep learning-based stereo matching with feature matching techniques. To improve the discontinuity and low accuracy in disparity maps from traditional wave image matching algorithms, this paper proposes the use of a high-precision stereo matching method based on Pyramid Stereo Matching Network (PSM-Net). A 3D reconstruction method integrating Scale-Invariant Feature Transform (SIFT) with stereo matching was also introduced to overcome the limitations of template matching and interleaved spectrum methods, which only provide 2D data and fail to capture the full 3D motion of waves. This approach enables accurate wave direction measurement. Additionally, a six-degree-of-freedom platform was proposed to simulate waves, addressing the high costs and attenuation issues of traditional wave tank simulations. Experimental results show the prototype system achieves a wave height accuracy within 5%, period accuracy within 4%, and direction accuracy of ±2°, proving the method’s effectiveness and offering a new approach to stereo vision-based wave measurement.
- Published
- 2024
35. A new stereo matching algorithm based on improved four-moded census transform and adaptive cross pyramid model
- Author
-
Zhongsheng Li, Jianchao Huang, Wencheng Wang, and Yucai Huang
- Subjects
stereo matching, improved fct, cross pyramid, adaptive window, Mathematics, QA1-939, Applied mathematics. Quantitative methods, T57-57.97
- Abstract
Stereo matching is still very challenging in terms of depth discontinuity, occlusions, weak texture regions, and noise resistance. To address the problems of poor noise immunity of local stereo matching and low matching accuracy in weak texture regions, a stereo matching algorithm (iFCTACP) based on improved four-moded census transform (iFCT) and a novel adaptive cross pyramid (ACP) structure were proposed. The algorithm combines the improved four-moded census transform matching cost with traditional measurement methods, which allows better anti-interference performance. The cost aggregation is performed on the adaptive cross pyramid structure, a unique structure that improves the traditional single mode of the cross. This structure not only enables regions with similar color and depth to be connected but also achieves cost smoothing across regions, significantly reducing the possibility of mismatch due to inadequate corresponding matching information and providing stronger robustness to weak texture regions. Experimental results show that the iFCTACP algorithm can effectively suppress noise interference, especially in illumination and exposure. Furthermore, it can markedly improve the error matching rate in weak texture regions with better generalization. Compared with some typical algorithms, the iFCTACP algorithm exhibits better performance whose average mismatching rate is only 3.33%.
- Published
- 2024
36. A stereo matching algorithm for coal mine underground images based on threshold and weight under Census transform
- Author
-
Chunyu YANG, Ziru SONG, and Xin ZHANG
- Subjects
driverless auxiliary transport vehicle, binocular vision, census transform, salt-and-pepper noise, stereo matching, Mining engineering. Metallurgy, TN1-997
- Abstract
Binocular image stereo matching is a key technology to realize autonomous obstacle avoidance and visual reconnaissance of unmanned auxiliary transport vehicles in coal mines. However, factors such as high dust and unstable lighting conditions in coal mines can lead to salt-and-pepper noise in the images collected by the visual sensor, resulting in a high stereo matching error rate. Therefore, a Census stereo matching algorithm based on the combination of threshold and weight is proposed to reduce the impact of salt-and-pepper noise on stereo matching. The main contributions include: ① threshold processing is carried out on the gray values of all pixels in the support window to remove the pixels with maximum and minimum gray values in the support window and eliminate the impact of outliers on the weighted fusion; ② the four diagonal pixels corresponding to the center point are weighted and fused to replace the center point pixel. Pixel points are selected along the four diagonal lines intersecting at the center pixel, with step sizes ranging from 1 to 3. According to the corresponding steps, weights of 0.7, 0.2, and 0.1 are assigned. The valid pixel points among these 12 points are multiplied by their respective weights, and the sum is divided by the sum of the valid weights. This process yields the reference value of the center pixel point after weighted processing, addressing the dependency of traditional algorithms on the center pixel of the Census transform window. Consequently, this approach enhances matching precision. The experimental results show that the average error rate calculated by the proposed algorithm is reduced by 5.64% compared to traditional Census algorithms, and reduced by 1.71% compared to the mean-based Census algorithm. Moreover, the average error rate under different noise levels calculated by the proposed algorithm is reduced by 15.93% compared to the traditional Census algorithm, and reduced by 16.62% compared to the mean-based one. In non-occluded areas, the error matching rate of our algorithm is reduced by 17.19% compared to the traditional Census algorithm and 18.11% compared to the mean-based Census algorithm. The proposed Census stereo matching algorithm, which combines threshold and weight, effectively enhances the robustness against noise, reduces the error rate, and improves matching accuracy.
- Published
- 2024
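The weighted reference value described in the abstract above can be sketched as follows. This is one reading of the abstract, not the authors' implementation. It assumes a 7 x 7 support window (the smallest window that contains diagonal steps 1 to 3) and treats any pixel equal to the window's minimum or maximum gray value as invalid. The function name and the fallback to the original center value are hypothetical.

```python
WEIGHTS = {1: 0.7, 2: 0.2, 3: 0.1}  # diagonal step -> weight, as stated in the abstract
DIAGONALS = [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def reference_value(image, row, col):
    """Weighted reference value for the centre pixel of an assumed 7 x 7 Census window."""
    height, width = len(image), len(image[0])
    # Threshold step: find the extreme grey values inside the support window.
    window = [
        image[r][c]
        for r in range(max(0, row - 3), min(height, row + 4))
        for c in range(max(0, col - 3), min(width, col + 4))
    ]
    lo, hi = min(window), max(window)

    weighted_sum = 0.0
    weight_total = 0.0
    for dr, dc in DIAGONALS:
        for step, weight in WEIGHTS.items():
            r, c = row + dr * step, col + dc * step
            if not (0 <= r < height and 0 <= c < width):
                continue  # outside the image: not a valid point
            value = image[r][c]
            if value == lo or value == hi:
                continue  # assumed outlier (e.g. salt-and-pepper noise)
            weighted_sum += weight * value
            weight_total += weight
    # Assumed fallback: keep the original centre value if no diagonal point is valid.
    return weighted_sum / weight_total if weight_total else float(image[row][col])


image = [[100] * 7 for _ in range(7)]
image[3][3] = 255  # salt noise on the centre pixel
image[0][0] = 0    # pepper noise on one of the 12 diagonal points
ref = reference_value(image, 3, 3)  # the noisy centre is replaced by a value near 100
```

Because the result is divided by the sum of the valid weights only, dropping a noisy diagonal point does not bias the reference value toward zero.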
37. Correction Compensation and Adaptive Cost Aggregation for Deep Laparoscopic Stereo Matching.
- Author
-
Zhang, Jian, Yang, Bo, Zhao, Xuanchi, and Shi, Yi
- Subjects
PLASTIC surgery, IMAGE registration, FOCAL length, THREE-dimensional imaging, DEPTH perception, SURGICAL robots
- Abstract
Featured Application: This work explores the computation of disparity in stereoscopic laparoscopic images through stereo matching algorithms. By integrating the focal length and baseline of the laparoscopic vision system, we can transform the disparity into depth measurements. This digitized depth information facilitates the three-dimensional reconstruction of surgical scenes, and the real-time three-dimensional reconstructed images have the potential to provide supplementary guidance information to surgeons during procedures, thereby reducing surgical risks. Additionally, by leveraging this known digitized depth information, surgical robots can synchronize their movements with beating organs, thus reducing the complexity of such surgeries.
Perception of digitized depth is a prerequisite for enabling the intelligence of three-dimensional (3D) laparoscopic systems. In this context, stereo matching of laparoscopic stereoscopic images presents a promising solution. However, the current research in this field still faces challenges. First, the acquisition of accurate depth labels in a laparoscopic environment proves to be a difficult task. Second, errors in the correction of laparoscopic images are prevalent. Finally, laparoscopic image registration suffers from ill-posed regions such as specular highlights and textureless areas. In this paper, we make significant contributions by developing (1) a correction compensation module to overcome correction errors; (2) an adaptive cost aggregation module to improve prediction performance in ill-posed regions; (3) a novel self-supervised stereo matching framework based on these two modules. Specifically, our framework rectifies features and images based on learned pixel offsets, and performs differentiated aggregation on cost volumes based on their value. The experimental results demonstrate the effectiveness of the proposed modules. On the SCARED dataset, our model reduces the mean depth error by 12.6% compared to the baseline model and outperforms the state-of-the-art unsupervised methods and well-generalized models. [ABSTRACT FROM AUTHOR]
- Published
- 2024
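The abstract above mentions turning disparity into depth using the focal length and baseline. For a rectified stereo pair this is the standard pinhole relation: depth equals focal length times baseline divided by disparity. The sketch below illustrates that relation only. The function name and the numeric values are hypothetical and are not parameters of any real laparoscope.

```python
def disparity_to_depth(disparity_px, focal_length_px, baseline_mm):
    """Convert a disparity in pixels to a depth in millimetres (rectified stereo pair)."""
    if disparity_px <= 0:
        return None  # zero or negative disparity gives no finite depth
    return focal_length_px * baseline_mm / disparity_px


# Hypothetical values: 1000 px focal length, 4 mm baseline, 50 px disparity.
depth_mm = disparity_to_depth(disparity_px=50.0, focal_length_px=1000.0, baseline_mm=4.0)
```

Depth is inversely proportional to disparity, so a fixed disparity error produces a larger depth error for distant surfaces than for near ones.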
38. A new stereo matching algorithm based on improved four-moded census transform and adaptive cross pyramid model.
- Author
-
Li, Zhongsheng, Huang, Jianchao, Wang, Wencheng, and Huang, Yucai
- Subjects
ALGORITHMS, COMPUTER vision, GENERALIZATION, COST estimates, INTERFERENCE (Sound)
- Abstract
Stereo matching is still very challenging in terms of depth discontinuity, occlusions, weak texture regions, and noise resistance. To address the problems of poor noise immunity of local stereo matching and low matching accuracy in weak texture regions, a stereo matching algorithm (iFCTACP) based on improved four-moded census transform (iFCT) and a novel adaptive cross pyramid (ACP) structure were proposed. The algorithm combines the improved four-moded census transform matching cost with traditional measurement methods, which allows better anti-interference performance. The cost aggregation is performed on the adaptive cross pyramid structure, a unique structure that improves the traditional single mode of the cross. This structure not only enables regions with similar color and depth to be connected but also achieves cost smoothing across regions, significantly reducing the possibility of mismatch due to inadequate corresponding matching information and providing stronger robustness to weak texture regions. Experimental results show that the iFCTACP algorithm can effectively suppress noise interference, especially in illumination and exposure. Furthermore, it can markedly improve the error matching rate in weak texture regions with better generalization. Compared with some typical algorithms, the iFCTACP algorithm exhibits better performance whose average mismatching rate is only 3.33 %. [ABSTRACT FROM AUTHOR]
- Published
- 2024
39. Guided aggregation and disparity refinement for real-time stereo matching.
- Author
-
Yang, Jinlong, Wu, Cheng, Wang, Gang, and Chen, Dong
- Abstract
Stereo matching methods based on convolution neural network (CNN) often face challenges such as edge blurring and the loss of small structures. These issues often result in incorrect disparity assignments when upsampling the disparity map. To address this problem, we propose a disparity refinement module (GDU-CTF) that combines guided disparity map upsampling with a coarse-to-fine process. This approach effectively restores incorrect disparity values in the final disparity map. Furthermore, due to the insufficient aggregation of global geometric and contextual texture features using basic encoder–decoder 3D convolutional networks, we propose a guided patch cost aggregation module (GPA) that generates a more precise initial disparity map for textureless areas. These modules complement each other and are efficient, resulting in an accurate and lightweight framework for stereo matching. Experimental results demonstrate that our algorithm has excellent accuracy in generating disparity maps and achieves outstanding real-time performance, with an inference time of just 0.03 s on Scene Flow and KITTI datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
40. Structured support vector machine with coarse-to-fine PatchMatch filtering for stereo matching.
- Author
-
Yao, Peng, Sang, Haiwei, and Cheng, Xu
- Subjects
SUPPORT vector machines, MACHINE learning, FEATURE extraction, STEREO vision (Computer science), OBJECT tracking (Computer vision)
- Abstract
In the past decades, a variety of learning-based algorithms have emerged that seek better solutions for stereo matching by leveraging various machine learning techniques. To enrich the methodology of learning-based stereo matching, this paper casts disparity estimation as a regression problem using a Structured Support Vector Machine (SSVM). Three categories of features are extracted from disparity cues for training the SSVM. In particular, one of the three features, named 'Coarse-to-Fine PatchMatch Filtering', effectively exploits region and pixel disparity cues. To obtain region disparity cues, we adopt the MeshStereo and MeshStereo with Cross-Scale algorithms; to obtain pixel disparity cues, the PatchMatch and Cross-Scale PatchMatch stereo matching algorithms are used. Performance evaluations on the Middlebury v.2 and v.3 stereo data sets demonstrate that the proposed algorithm achieves accuracy comparable to other competitive learning-based ones. It is worth pointing out that our proposal is several orders of magnitude faster than the others in training time. [ABSTRACT FROM AUTHOR]
- Published
- 2024
41. A Binocular Color Line-Scanning Stereo Vision System for Heavy Rail Surface Detection and Correction Method of Motion Distortion.
- Author
-
Wang, Chao, Luo, Weixi, Niu, Menghui, Li, Jiqiang, and Song, Kechen
- Subjects
BINOCULAR vision, VISUAL fields, KALMAN filtering, LIGHT sources, THREE-dimensional imaging, INDUSTRIAL goods
- Abstract
Thanks to the line-scanning camera, the measurement method based on line-scanning stereo vision has high optical accuracy, data transmission efficiency, and a wide field of vision. It is more suitable for continuous operation and high-speed transmission of industrial product detection sites. However, the one-dimensional imaging characteristics of the line-scanning camera cause motion distortion during image data acquisition, which directly affects the accuracy of detection. Effectively reducing the influence of motion distortion is the primary problem to ensure detection accuracy. To obtain the two-dimensional color image and three-dimensional contour data of the heavy rail surface at the same time, a binocular color line-scanning stereo vision system is designed to collect the heavy rail surface data combined with the bright field illumination of the symmetrical linear light source. Aiming at the image motion distortion caused by system installation error and collaborative acquisition frame rate mismatch, this paper uses the checkerboard target and two-step cubature Kalman filter algorithm to solve the nonlinear parameters in the motion distortion model, estimate the real motion, and correct the image information. The experiments show that the accuracy of the data contained in the image is improved by 57.3% after correction. [ABSTRACT FROM AUTHOR]
- Published
- 2024
42. Refining disparity maps using deep learning and edge-aware smoothing filter.
- Author
-
Gani, Shamsul Fakhar Abd, Miskon, Muhammad Fahmi, Hamzah, Rostam Affendi, Hamid, Mohd Saad, Kadmin, Ahmad Fauzan, and Herman, Adi Irwan
- Subjects
DEEP learning, CONVOLUTIONAL neural networks, POINT cloud, DEPTH perception, SURFACE reconstruction, MAPS, STEREO image
- Abstract
Stereo matching algorithms are crucial for applications that rely on three-dimensional (3D) surface reconstruction, producing a disparity map that contains depth information by computing the disparity values between corresponding points from a stereo image pair. In order to yield desirable results, the proposed stereo matching algorithm must possess a high degree of resilience against radiometric variation and edge inconsistencies. In this article, a convolutional neural network (CNN) is employed in the first stage to generate the raw matching cost, which is subsequently filtered with a bilateral filter (BF) and applied with cross-based cost aggregation (CBCA) during the cost aggregation stage to enhance precision. Winner-take-all (WTA) strategy is implemented to normalise the disparity map values. Finally, the resulting output is subjected to an edge-aware smoothing filter (EASF) to reduce the noise. Due to its resistance to high contrast and brightness, the filter is found to be effective in refining and eliminating noise from the output image. Despite discontinuities like adiron's lost cup handle or artl's shattered rods, this approach, based on experimental research utilizing a Middlebury standard validation benchmark, yields a high level of accuracy, with an average non-occluded error of 6.79%, comparable to other published methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
43. Stereo matching algorithm using deep learning and edge-preserving filter for machine vision.
- Author
-
Abd Gani, Shamsul Fakhar, Miskon, Muhammad Fahmi, Hamzah, Rostam Affendi, Hamid, Mohd Saad, Kadmin, Ahmad Fauzan, and Herman, Adi Irwan
- Subjects
COMPUTER vision, DEEP learning, CONVOLUTIONAL neural networks, POINT cloud, STEREOSCOPIC cameras, SURFACE reconstruction
- Abstract
Machine vision research began with a single-camera system, but these systems had various limitations from having just one point-of-view of the environment and no depth information, therefore stereo cameras were invented. This paper proposes a hybrid method of a stereo matching algorithm with the goal of generating an accurate disparity map critical for applications such as 3D surface reconstruction and robot navigation to name a few. Convolutional neural network (CNN) is utilised to generate the matching cost, which is then input into cost aggregation to increase accuracy with the help of a bilateral filter (BF). Winner-take-all (WTA) is used to generate the preliminary disparity map. An edge-preserving filter (EPF) is applied to that output based on a transform that defines an isometry between curves on the 2D image manifold in 5D and the real line to eliminate these artefacts. The transform warps the input signal adaptively to allow linear 1D filtering. Due to the filter's resistance to high contrast and brightness, it is effective in refining and removing noise from the output image. Based on experimental research employing a Middlebury standard validation benchmark, this approach gives high accuracy with an average non-occluded error of 6.71% comparable to other published methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
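The winner-take-all (WTA) step named in the abstract above has a standard form: for each pixel, keep the disparity whose aggregated matching cost is lowest. The sketch below shows that generic step on a tiny cost volume. The `cost_volume[d][y][x]` layout and the function name are assumptions for illustration, and the CNN cost computation and filtering stages of the paper are not reproduced.

```python
def winner_take_all(cost_volume):
    """Pick, for every pixel, the disparity index with the lowest aggregated cost.

    cost_volume[d][y][x] is the matching cost of pixel (y, x) at disparity d.
    """
    num_disp = len(cost_volume)
    height, width = len(cost_volume[0]), len(cost_volume[0][0])
    return [
        [min(range(num_disp), key=lambda d: cost_volume[d][y][x]) for x in range(width)]
        for y in range(height)
    ]


# Hypothetical cost volume: 3 candidate disparities for a 1 x 2 image.
cost_volume = [
    [[5, 1]],  # costs at disparity 0
    [[2, 4]],  # costs at disparity 1
    [[9, 0]],  # costs at disparity 2
]
disparity_map = winner_take_all(cost_volume)  # [[1, 2]]
```

WTA decides each pixel independently, which is why the resulting preliminary map usually needs a refinement filter afterwards.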
44. Research on 3D virtual vision matching based on interactive color segmentation.
- Author
-
Yahui Wang, Haiwen Wang, Juan Jin, and Yingfeng Kuang
- Subjects
IMAGE segmentation, BINOCULAR vision, STEREO vision (Computer science), TRANSFORMER models, ERROR rates, ALGORITHMS
- Abstract
Given the prevalent issues surrounding accuracy and efficiency in contemporary stereo matching algorithms, this research introduces an innovative image segmentation based approach. The proposed methodology integrates residual and Swim Transformer modules into the established 3D Unet framework, yielding the Res-Swim-UNet image segmentation model. The algorithm estimates the disparateness of segmented outputs by employing regression techniques, culminating in a comprehensive disparity map. Experimental findings underscore the superiority of the proposed algorithm across all evaluated metrics. Specifically, the proposed network demonstrates marked improvements, with IoU and mPA enhancements of 2.9% and 162%, respectively. Notably, the average matching error rate of the algorithm registers at 2.02%, underscoring its efficacy in achieving precise stereoscopic matching. Moreover, the model’s enhanced generalization capability and robustness underscore its potential for widespread applicability. [ABSTRACT FROM AUTHOR]
- Published
- 2024
45. Deep-Learning-Based Stereo Matching of Long-Distance Sea Surface Images for Sea Level Monitoring Systems.
- Author
-
Yang, Ying, Lu, Cunwei, and Li, Zhenhua
- Subjects
OCEAN waves, IMAGE analysis, STEREOPHONIC sound systems
- Abstract
Due to the advantages of coastal areas in the fields of agriculture, transport, and fishing, increasingly more people are moving to these areas. Sea level information is important for these people to survive after extreme sea level events. With the recent improvements in computing and storage capacities, image analysis as a new measuring method is being rapidly developed and widely applied. In this paper, a multi-camera-based sea level height measuring system was built along Japan's coast and a deep-learning-based stereo matching method has been proposed for this system to complete 3D measurements. In this system, cameras are set with long base distances to ensure the long-distance monitoring system's precision, which causes a huge difference between the fields of view of the left and right cameras. Since most common network structures complete stereo matching by depth-wise cross-correlation between left and right images, they rely too much on the high-quality rectification of two images and fail on our long-distance sea surface images. We established a feature detection and matching network to realize sea wave extraction and sparse stereo matching for the system. Based on our previous result using the traditional method, the initial disparity was computed to reduce the search range of stereo matching. A training set with 785 pairs of sea surface images and 10,172 pairs of well-matched sea wave images was constructed to supervise the network. The experimental results verified that the proposed method can realize sea wave extraction and mask generation. It can also realize sparse matching of sea surface images regardless of poor rectification. [ABSTRACT FROM AUTHOR]
- Published
- 2024
46. A stereo matching algorithm for coal mine underground images based on a threshold and weight Census transform.
- Author
-
杨春雨, 宋子儒, and 张鑫
- Abstract
Copyright of Coal Science & Technology (0253-2336) is the property of Coal Science & Technology and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
47. Lightweight and Error-Tolerant Stereo Matching with a Stochastic Computing Processor.
- Author
-
An, Seongmo, Oh, Jongwon, Lee, Sangho, Kim, Jinyeol, Jeong, Youngwoo, Kim, Jeongeun, and Lee, Seung Eun
- Subjects
PROBABILITY theory
- Abstract
Stereo matching, utilized in diverse fields, poses a challenge to systems in resource-constrained environments because the computational load grows significantly with image resolution. The challenge is critical because fields utilizing stereo matching require short operational time for real-time applications and a low-power architecture. Stochastic computing (SC) can be a valuable approach to this challenge: it reduces the computational load by representing binary numbers with stochastic sequences, which are encoded as probability values, and by leveraging the concept of mathematical probability. The characteristics of stochastic computing also make it possible for a system to be error-tolerant. Therefore, in this paper, we propose an approach for lightweight and error-tolerant stereo matching with a hardware-implemented stochastic computing processor. To verify the feasibility and error tolerance of the proposed system, we implemented it and conducted experiments comparing depth maps with or without stochastic computing by calculating similarities. According to the experimental results, the proposed system showed no significant differences in output depth maps and improved the depth maps from error-injected input images by an average of 58.95%. Therefore, we demonstrated that stereo matching with stochastic computing is feasible and error-tolerant. [ABSTRACT FROM AUTHOR]
- Published
- 2024
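The idea of encoding numbers as stochastic sequences, mentioned in the abstract above, can be illustrated with a generic textbook example of stochastic computing. It is not the processor described in the paper. In the unipolar encoding assumed here, a value in [0, 1] is the probability that a bit is 1, and multiplying two independent streams reduces to a bitwise AND. All function names, the stream length, and the seed are hypothetical.

```python
import random


def to_stream(value, length, rng):
    """Encode a probability in [0, 1] as a random bitstream with P(bit = 1) = value."""
    return [1 if rng.random() < value else 0 for _ in range(length)]


def from_stream(bits):
    """Decode a bitstream back to a probability estimate (fraction of ones)."""
    return sum(bits) / len(bits)


def sc_multiply(stream_a, stream_b):
    """Multiply two independent unipolar streams with a bitwise AND."""
    return [a & b for a, b in zip(stream_a, stream_b)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
a = to_stream(0.5, 20000, rng)
b = to_stream(0.6, 20000, rng)
product = from_stream(sc_multiply(a, b))  # close to 0.5 * 0.6 = 0.3, but not exact
```

Flipping a single bit in a stream of N bits changes the decoded value by only 1/N. This is the usual argument for the error tolerance of this representation, at the cost of approximate results.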
48. Partially occluded pedestrian detection algorithm based on binocular vision.
- Author
-
刘城逍, 何涛, and 景嘉宝
- Abstract
An algorithm for detecting partially occluded pedestrians based on multi-feature fusion and tree-structured semi-global stereo matching was proposed to address the reduced accuracy of pedestrian detection caused by partial occlusion. The simple linear iterative clustering (SLIC) algorithm was employed for superpixel segmentation to enhance the contour information of pedestrians, and the tree-structured multi-feature fusion semi-global stereo matching algorithm was used to generate depth maps. Pedestrian, background, and obstacle information were separated using an adaptive segmentation algorithm to obtain the region of interest. The region of interest was positioned around the head and shoulders of the pedestrian, where features were distinct and stable, to impose constraints. Feature extraction was conducted using a dimension-reduced histogram of oriented gradients (HOG), and a sample set was generated for training a support vector machine (SVM) classifier, ultimately achieving the detection of partially occluded pedestrians. The experiment shows that, compared with other pedestrian detection algorithms, the proposed algorithm has a higher accuracy in partially occluded scenes, proving its effectiveness. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
49. Few-Shot Stereo Matching with High Domain Adaptability Based on Adaptive Recursive Network.
- Author
-
Wu, Rongcheng, Wang, Mingzhe, Li, Zhidong, Zhou, Jianlong, Chen, Fang, Wang, Xuan, and Sun, Changming
- Subjects
-
CONVOLUTIONAL neural networks, ROBOT vision, DEEP learning, AUTONOMOUS vehicles, AUTONOMOUS robots, SUPERVISED learning
- Abstract
Deep-learning-based stereo matching algorithms have been extensively researched in areas such as robot vision and autonomous driving because of their promising performance. However, these algorithms require a large amount of labeled training data and suffer from inadequate domain adaptability, which limits their applicability and flexibility. This work addresses both deficiencies and proposes a few-shot trained stereo matching model with high domain adaptability. In the model, stereo matching is formulated as a dynamic optimization problem over the possible solution space, and a multi-scale matching cost computation method is proposed to obtain the possible solution space for the application scenes. Moreover, an adaptive recurrent 3D convolutional neural network is designed to determine the optimal solution from the possible solution space. Experimental results demonstrate that the proposed model outperforms state-of-the-art stereo matching algorithms in terms of training requirements and domain adaptability. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
50. Binocular depth measurement method for desktop interaction scenarios.
- Author
-
叶彬, 朱兴帅, 姚康, 丁上上, and 付威威
- Published
- 2024
- Full Text
- View/download PDF