449 results
Search Results
2. Joint Classification of Hyperspectral and LiDAR Data Based on Adaptive Gating Mechanism and Learnable Transformer.
- Author
- Wang, Minhui, Sun, Yaxiu, Xiang, Jianhong, Sun, Rui, and Zhong, Yu
- Subjects
- TRANSFORMER models, CONVOLUTIONAL neural networks, LIDAR, DIGITAL elevation models, TRANSFER matrix, DATA fusion (Statistics)
- Abstract
Utilizing multi-modal data, as opposed to only hyperspectral image (HSI) data, enhances target identification accuracy in remote sensing. Transformers are applied to multi-modal data classification for their long-range dependency modeling but often overlook intrinsic image structure by directly flattening image blocks into vectors. Moreover, as the encoder deepens, unprofitable information negatively impacts classification performance. Therefore, this paper proposes a learnable transformer with an adaptive gating mechanism (AGMLT). Firstly, a spectral–spatial adaptive gating mechanism (SSAGM) is designed to comprehensively extract local information from images. It mainly contains point depthwise attention (PDWA) and asymmetric depthwise attention (ADWA). The former extracts the spectral information of HSI, and the latter extracts the spatial information of HSI and the elevation information of LiDAR-derived rasterized digital surface models (LiDAR-DSM). By omitting linear layers, local continuity is maintained. Then, layer scale and a learnable transition matrix are introduced into the original transformer encoder and self-attention to form the learnable transformer (L-Former). This improves data dynamics and prevents performance degradation as the encoder deepens. Subsequently, learnable cross-attention (LC-Attention) with the learnable transfer matrix is designed to augment the fusion of multi-modal data by enriching feature information. Finally, poly loss, known for its adaptability to multi-modal data, is employed in training the model. Experiments are conducted on four well-known multi-modal datasets: Trento (TR), MUUFL (MU), Augsburg (AU), and Houston2013 (HU). The results show that AGMLT outperforms several existing models. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
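The "layer scale" device this abstract credits with preventing degradation in deep encoders can be sketched in a few lines (a generic LayerScale-style residual scaling; the variable names and the 1e-4 init are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def layer_scale_residual(x, branch_out, gamma):
    """Scale a residual branch by a learnable per-channel vector gamma
    before adding it back; a small init keeps deep encoders near-identity,
    which is how layer scale counters degradation as depth grows."""
    return x + gamma * branch_out

tokens = np.ones((4, 8))       # (num_tokens, dim)
branch = np.full((4, 8), 2.0)  # e.g. a self-attention branch output
gamma = np.full(8, 1e-4)       # small learnable per-channel init
out = layer_scale_residual(tokens, branch, gamma)
```

Because `gamma` starts near zero, each new encoder block initially behaves like an identity map, and training decides how much of the attention branch to admit.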
3. Vulnerable Road User Skeletal Pose Estimation Using mmWave Radars.
- Author
- Zeng, Zhiyuan, Liang, Xingdong, Li, Yanlei, and Dang, Xiangwei
- Subjects
- ROAD users, TRACKING radar, RADAR targets, CONVOLUTIONAL neural networks, RADAR signal processing, DATA augmentation
- Abstract
A skeletal pose estimation method, named RVRU-Pose, is proposed to estimate the skeletal pose of vulnerable road users based on distributed non-coherent mmWave radar. In view of the limitation that existing methods for skeletal pose estimation are only applicable to small scenes, this paper proposes a strategy that combines radar intensity heatmaps and coordinate heatmaps as input to a deep learning network. In addition, we design a multi-resolution data augmentation and training method suitable for radar to achieve target pose estimation for remote and multi-target application scenarios. Experimental results show that RVRU-Pose can achieve better than 2 cm average localization accuracy for different subjects in different scenarios, which is superior in terms of accuracy and time compared to existing state-of-the-art methods for human skeletal pose estimation with radar. As an essential performance parameter of radar, the impact of angular resolution on the estimation accuracy of a skeletal pose is quantitatively analyzed and evaluated in this paper. Finally, RVRU-Pose has also been extended to the task of estimating the skeletal pose of a cyclist, reflecting the strong scalability of the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
4. LDnADMM-Net: A Denoising Unfolded Deep Neural Network for Direction-of-Arrival Estimations in A Low Signal-to-Noise Ratio.
- Author
- Liang, Can, Liu, Mingxuan, Li, Yang, Wang, Yanhua, and Hu, Xueyao
- Subjects
- DIRECTION of arrival estimation, CONVOLUTIONAL neural networks, SIGNAL-to-noise ratio, COMPRESSED sensing, SIGNAL denoising
- Abstract
In this paper, we explore the problem of direction-of-arrival (DOA) estimation for a non-uniform linear array (NULA) under strong noise. Compressed sensing (CS)-based methods are widely used for NULA DOA estimation. However, these methods commonly rely on hand-tuned parameters, which are hard to set well, and they lack robustness under strong noise. To address these issues, this paper proposes a novel DOA estimation approach using a deep neural network (DNN) for a NULA in a low signal-to-noise ratio (SNR). The proposed network is designed based on the denoising convolutional neural network (DnCNN) and the alternating direction method of multipliers (ADMM), and is dubbed LDnADMM-Net. First, we construct an unfolded DNN architecture that mimics the iterative processing of ADMM. In this way, the parameters of ADMM are transformed into network weights, and we can adaptively optimize these parameters through network training. Then, we employ the DnCNN to develop a denoising module (DnM) and integrate it into the unfolded DNN. Using this DnM, we enhance the anti-noise ability of the proposed network and obtain a robust DOA estimation in a low SNR. The simulation and experimental results show that the proposed LDnADMM-Net can obtain high-accuracy and super-resolution DOA estimations for a NULA with strong robustness in a low SNR. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
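The unfolding idea behind LDnADMM-Net — turning ADMM iterations into network layers whose parameters become trainable weights — can be illustrated with plain ADMM for the l1-regularized least-squares problem (a minimal sketch; the denoising module is omitted and all sizes and values are illustrative, not the paper's):

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of the l1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def admm_unfolded(y, A, layers=100, rho=1.0, lam=0.05):
    """Iterate ADMM for min_x 0.5*||Ax - y||^2 + lam*||x||_1.
    In an unfolded network, each loop pass becomes one layer and
    rho/lam become per-layer weights learned from data."""
    n = A.shape[1]
    x = np.zeros(n); z = np.zeros(n); u = np.zeros(n)
    P = np.linalg.inv(A.T @ A + rho * np.eye(n))
    Aty = A.T @ y
    for _ in range(layers):
        x = P @ (Aty + rho * (z - u))          # least-squares x-update
        z = soft_threshold(x + u, lam / rho)   # sparsifying z-update
        u = u + x - z                          # dual ascent
    return z

rng = np.random.default_rng(0)
x_true = np.zeros(8); x_true[2] = 1.5; x_true[5] = -1.0  # sparse "DOA" vector
A = rng.standard_normal((20, 8))                         # sensing matrix
z_hat = admm_unfolded(A @ x_true, A)
```

The point of unrolling is that `rho` and `lam`, which CS methods require the user to fine-tune, are instead optimized by backpropagation through these fixed-count iterations.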
5. Target Detection Method for High-Frequency Surface Wave Radar RD Spectrum Based on (VI)CFAR-CNN and Dual-Detection Maps Fusion Compensation.
- Author
- Ji, Yuanzheng, Liu, Aijun, Chen, Xuekun, Wang, Jiaqi, and Yu, Changjun
- Subjects
- CONVOLUTIONAL neural networks, TRACKING algorithms, AUTOMATIC identification
- Abstract
This paper proposes a method for the intelligent detection of high-frequency surface wave radar (HFSWR) targets. This method cascades the adaptive constant false alarm (CFAR) detector variability index (VI) with the convolutional neural network (CNN) to form a cascade detector (VI)CFAR-CNN. First, the (VI)CFAR algorithm is used for the first-level detection of the range–Doppler (RD) spectrum; based on this result, the two-dimensional window slice data are extracted using the window with the position of the target on the RD spectrum as the center, and input into the CNN model to carry out further target and clutter identification. When the detection rate of the detector reaches a certain level and cannot be further improved due to the convergence of the CNN model, this paper uses a dual-detection maps fusion method to compensate for the loss of detection performance. First, the optimized parameters are used to perform the weighted fusion of the dual-detection maps, and then, the connected components in the fused detection map are further processed to achieve an independent (VI)CFAR to compensate for the (VI)CFAR-CNN detection results. Due to the difficulty in obtaining HFSWR data that include comprehensive and accurate target truth values, this paper adopts a method of embedding targets into the measured background to construct the RD spectrum dataset for HFSWR. At the same time, the proposed method is compared with various other methods to demonstrate its superiority. Additionally, a small amount of automatic identification system (AIS) and radar correlation data are used to verify the effectiveness and feasibility of this method on completely measured HFSWR data. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
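The first detection stage can be illustrated with the classic cell-averaging CFAR, a simplified stand-in for the paper's variability-index (VI)CFAR (window sizes and the false-alarm rate below are illustrative):

```python
import numpy as np

def ca_cfar(power, guard=2, train=8, pfa=1e-3):
    """Cell-averaging CFAR on a 1-D power profile: each cell under test is
    compared with a threshold scaled from the mean of surrounding training
    cells, with guard cells excluded to avoid target self-masking."""
    n_train = 2 * train
    # threshold multiplier for exponentially distributed noise power
    alpha = n_train * (pfa ** (-1.0 / n_train) - 1.0)
    det = np.zeros_like(power, dtype=bool)
    for i in range(guard + train, len(power) - guard - train):
        left = power[i - guard - train:i - guard]
        right = power[i + guard + 1:i + guard + train + 1]
        noise = np.mean(np.concatenate([left, right]))
        det[i] = power[i] > alpha * noise
    return det

rng = np.random.default_rng(1)
power = rng.exponential(1.0, 300)   # clutter-only background
power[150] = 200.0                  # injected strong target echo
det = ca_cfar(power)
```

In the cascade described above, the cells flagged by this first stage would then be cut out as windows centered on each detection and passed to the CNN for target-versus-clutter identification.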
6. Combining "Deep Learning" and Physically Constrained Neural Networks to Derive Complex Glaciological Change Processes from Modern High-Resolution Satellite Imagery: Application of the GEOCLASS-Image System to Create VarioCNN for Glacier Surges.
- Author
- Herzfeld, Ute C., Hessburg, Lawrence J., Trantow, Thomas M., and Hayes, Adam N.
- Abstract
The objectives of this paper are to investigate the trade-offs between a physically constrained neural network and a deep, convolutional neural network and to design a combined ML approach ("VarioCNN"). Our solution is provided in the framework of a cyberinfrastructure that includes a newly designed ML software, GEOCLASS-image (v1.0), modern high-resolution satellite image data sets (Maxar WorldView data), and instructions/descriptions that may facilitate solving similar spatial classification problems. Combining the advantages of the physically-driven connectionist-geostatistical classification method with those of an efficient CNN, VarioCNN provides a means for rapid and efficient extraction of complex geophysical information from submeter resolution satellite imagery. A retraining loop overcomes the difficulties of creating a labeled training data set. Computational analyses and developments are centered on a specific, but generalizable, geophysical problem: The classification of crevasse types that form during the surge of a glacier system. A surge is a glacial catastrophe, an acceleration of a glacier to typically 100–200 times its normal velocity. GEOCLASS-image is applied to study the current (2016-2024) surge in the Negribreen Glacier System, Svalbard. The geophysical result is a description of the structural evolution and expansion of the surge, based on crevasse types that capture ice deformation in six simplified classes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
7. SSAformer: Spatial–Spectral Aggregation Transformer for Hyperspectral Image Super-Resolution.
- Author
- Wang, Haoqian, Zhang, Qi, Peng, Tao, Xu, Zhongjie, Cheng, Xiangai, Xing, Zhongyang, and Li, Teng
- Subjects
- TRANSFORMER models, HIGH resolution imaging, CONVOLUTIONAL neural networks, REMOTE sensing, ENVIRONMENTAL monitoring, SPECTRAL imaging, IMAGE reconstruction algorithms
- Abstract
The hyperspectral image (HSI) distinguishes itself in material identification through its exceptional spectral resolution. However, its spatial resolution is constrained by hardware limitations, prompting the evolution of HSI super-resolution (SR) techniques. Single HSI SR endeavors to reconstruct high-spatial-resolution HSI from low-spatial-resolution inputs, and recent progress in deep learning-based algorithms has significantly advanced the quality of reconstructed images. However, convolutional methods struggle to extract comprehensive spatial and spectral features. Transformer-based models have yet to harness long-range dependencies across both dimensions fully, thus inadequately integrating spatial and spectral data. To solve the above problem, in this paper, we propose a new HSI SR method, SSAformer, which merges the strengths of CNNs and Transformers. It introduces specially designed attention mechanisms for HSI, including spatial and spectral attention modules, and overcomes the previous challenges in extracting and amalgamating spatial and spectral information. Evaluations on benchmark datasets show that SSAformer surpasses contemporary methods in enhancing spatial details and preserving spectral accuracy, underscoring its potential to expand HSI's utility in various domains, such as environmental monitoring and remote sensing. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
8. Changes in the Water Area of an Inland River Terminal Lake (Taitma Lake) Driven by Climate Change and Human Activities, 2017–2022.
- Author
- Zi, Feng, Wang, Yong, Lu, Shanlong, Ikhumhen, Harrison Odion, Fang, Chun, Li, Xinru, Wang, Nan, and Kuang, Xinya
- Subjects
- ENDORHEIC lakes, WATER resources development, CONVOLUTIONAL neural networks, LAKES, DEEP learning, CLIMATE change
- Abstract
Using a dataset capturing the seasonal and annual water body distribution of the lower Qarqan River in the Taitma Lake area from 2017 to 2022, combined with meteorological and hydraulic engineering data, the spatial and temporal change patterns of the Taitma Lake watershed area were determined. Analyses were conducted using PlanetScope (PS) satellite images and a deep learning model. The results revealed the following: ① Deep learning-based water body extraction provides significantly greater accuracy than the conventional water body index approach. With an accuracy of up to 96.0%, UPerNet provided the most effective extraction results among the three convolutional neural networks (U-Net, DeeplabV3+, and UPerNet) used for semantic segmentation; ② Between 2017 and 2022, Taitma Lake's water area decreased rapidly, with the distribution of water shifting predominantly in the east–west direction rather than the north–south. The shifts between 2017 and 2020 and between 2020 and 2022 were clearly discernible, with the latter stage (2020–2022) being more significant than the former (2017–2020); ③ Taitma Lake's changing water area has been primarily influenced by human activity over the last six years. This study thus provides a valuable scientific basis for water resource allocation that balances the development of water resources in the middle and upper reaches of the Tarim and Qarqan Rivers with the ecological protection of the downstream Taitma Lake. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
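The "conventional water body index approach" used as the baseline typically means thresholding a spectral index such as NDWI; a minimal sketch (band values and the zero threshold are illustrative):

```python
import numpy as np

def ndwi_mask(green, nir, threshold=0.0):
    """Normalized Difference Water Index: water reflects in green and
    absorbs in NIR, so NDWI = (G - NIR)/(G + NIR) is positive over water."""
    ndwi = (green - nir) / (green + nir + 1e-9)  # epsilon avoids 0/0
    return ndwi > threshold

green = np.array([[0.30, 0.05], [0.28, 0.06]])  # left column: water pixels
nir   = np.array([[0.05, 0.40], [0.04, 0.35]])  # right column: vegetation
mask = ndwi_mask(green, nir)
```

The abstract's point is that a fixed-index threshold like this generalizes poorly compared with the learned segmentation networks (U-Net, DeeplabV3+, UPerNet).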
9. MEA-EFFormer: Multiscale Efficient Attention with Enhanced Feature Transformer for Hyperspectral Image Classification.
- Author
- Sun, Qian, Zhao, Guangrui, Fang, Yu, Fang, Chenrong, Sun, Le, and Li, Xingying
- Subjects
- IMAGE recognition (Computer vision), CONVOLUTIONAL neural networks, DEEP learning, TRANSFORMER models, FEATURE extraction
- Abstract
Hyperspectral image classification (HSIC) has garnered increasing attention among researchers. While classical networks like convolutional neural networks (CNNs) have achieved satisfactory results with the advent of deep learning, they are confined to processing local information. Vision transformers, despite being effective at establishing long-distance dependencies, face challenges in extracting high-representation features for high-dimensional images. In this paper, we present the multiscale efficient attention with enhanced feature transformer (MEA-EFFormer), which is designed for the efficient extraction of spectral–spatial features, leading to effective classification. MEA-EFFormer employs a multiscale efficient attention feature extraction module to initially extract 3D convolution features and applies effective channel attention to refine spectral information. Following this, 2D convolution features are extracted and integrated with local binary pattern (LBP) spatial information to augment their representation. Then, the processed features are fed into a spectral–spatial enhancement attention (SSEA) module that facilitates interactive enhancement of spectral–spatial information across the three dimensions. Finally, these features undergo classification through a transformer encoder. We evaluate MEA-EFFormer against several state-of-the-art methods on three datasets and demonstrate its outstanding HSIC performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
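The local binary pattern (LBP) spatial feature the abstract pairs with the 2D convolution features can be sketched for the basic 3×3 case (the neighbour ordering is a convention choice, not specified by the paper):

```python
import numpy as np

def lbp_3x3(img):
    """Basic 3x3 local binary pattern: encode each interior pixel by
    thresholding its 8 neighbours against the centre value and packing
    the comparison results into one byte."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2), dtype=np.uint8)
    # fixed clockwise neighbour order starting at the top-left
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            c = img[i, j]
            code = 0
            for bit, (di, dj) in enumerate(offsets):
                if img[i + di, j + dj] >= c:
                    code |= 1 << bit
            out[i - 1, j - 1] = code
    return out
```

A flat region yields the all-ones code 255, while a local maximum yields 0, so the codes summarize local texture independently of absolute intensity — which is why LBP usefully complements learned convolution features.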
10. Locating and Grading of Lidar-Observed Aircraft Wake Vortex Based on Convolutional Neural Networks.
- Author
- Zhang, Xinyu, Zhang, Hongwei, Wang, Qichao, Liu, Xiaoying, Liu, Shouxin, Zhang, Rongchuan, Li, Rongzhong, and Wu, Songhua
- Subjects
- CONVOLUTIONAL neural networks, DOPPLER lidar, AERONAUTICAL safety measures
- Abstract
Aircraft wake vortices are serious threats to aviation safety. The Pulsed Coherent Doppler Lidar (PCDL) has been widely used in the observation of aircraft wake vortices due to its advantages of high spatial-temporal resolution and high precision. However, the post-processing algorithms require significant computing resources, which cannot achieve the real-time detection of a wake vortex (WV). This paper presents an improved Convolutional Neural Network (CNN) method for WV locating and grading based on PCDL data to avoid the influence of unstable ambient wind fields on the localization and classification results of WV. Typical WV cases are selected for analysis, and the WV locating and grading models are validated on different test sets. The consistency of the analytical algorithm and the CNN algorithm is verified. The results indicate that the improved CNN method achieves satisfactory recognition accuracy with higher efficiency and better robustness, especially in the case of strong turbulence, where the CNN method recognizes the wake vortex while the analytical method cannot. The improved CNN method is expected to be applied to optimize the current aircraft spacing criteria, which is promising in terms of aviation safety and economic benefit improvement. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
11. Object-Based Semi-Supervised Spatial Attention Residual UNet for Urban High-Resolution Remote Sensing Image Classification.
- Author
- Lu, Yuanbing, Li, Huapeng, Zhang, Ce, and Zhang, Shuqing
- Subjects
- CONVOLUTIONAL neural networks, DISTRIBUTION (Probability theory), WILCOXON signed-rank test, DEEP learning, LAND cover
- Abstract
Accurate urban land cover information is crucial for effective urban planning and management. While convolutional neural networks (CNNs) demonstrate superior feature learning and prediction capabilities using image-level annotations, the inherent mixed-category nature of input image patches leads to classification errors along object boundaries. Fully convolutional neural networks (FCNs) excel at pixel-wise fine segmentation, making them less susceptible to heterogeneous content, but they require fully annotated dense image patches, which may not be readily available in real-world scenarios. This paper proposes an object-based semi-supervised spatial attention residual UNet (OS-ARU) model. First, multiscale segmentation is performed to obtain segments from a remote sensing image, and segments containing sample points are assigned the categories of the corresponding points, which are used to train the model. Then, the trained model predicts class probabilities for all segments. Each unlabeled segment's probability distribution is compared against those of labeled segments for similarity matching under a threshold constraint. Through label propagation, pseudo-labels are assigned to unlabeled segments exhibiting high similarity to labeled ones. Finally, the model is retrained using the augmented training set incorporating the pseudo-labeled segments. Comprehensive experiments on aerial image benchmarks for Vaihingen and Potsdam demonstrate that the proposed OS-ARU achieves higher classification accuracy than state-of-the-art models, including OCNN, 2OCNN, and standard OS-U, reaching an overall accuracy (OA) of 87.83% and 86.71%, respectively. The performance improvements over the baseline methods are statistically significant according to the Wilcoxon Signed-Rank Test. Despite using significantly fewer sparse annotations, this semi-supervised approach still achieves comparable accuracy to the same model under full supervision. 
The proposed method thus makes a step forward in substantially alleviating the heavy sampling burden of FCNs (densely sampled deep learning models) to effectively handle the complex issue of land cover information identification and classification. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
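The label-propagation step described above — matching each unlabeled segment's class-probability distribution against labeled segments under a threshold — might look like the following sketch (cosine similarity and the threshold value are illustrative assumptions; the paper's similarity measure is not stated in this abstract):

```python
import numpy as np

def propagate_pseudo_labels(unlabeled_probs, labeled_probs, labels, tau=0.9):
    """For each unlabeled segment's class-probability vector, find the most
    similar labeled segment; inherit its label only when similarity >= tau,
    otherwise leave the segment unlabeled (None)."""
    pseudo = []
    for p in unlabeled_probs:
        sims = [np.dot(p, q) / (np.linalg.norm(p) * np.linalg.norm(q))
                for q in labeled_probs]
        j = int(np.argmax(sims))
        pseudo.append(labels[j] if sims[j] >= tau else None)
    return pseudo

labeled_probs = [np.array([0.9, 0.1]), np.array([0.1, 0.9])]
labels = ["water", "urban"]
unlabeled_probs = [np.array([0.85, 0.15]), np.array([0.5, 0.5])]
pseudo = propagate_pseudo_labels(unlabeled_probs, labeled_probs, labels, tau=0.95)
```

The threshold keeps ambiguous segments (like the second one above) out of the augmented training set, which is what makes retraining on pseudo-labels safe.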
12. Remote Sensing Image Dehazing via a Local Context-Enriched Transformer.
- Author
- Nie, Jing, Xie, Jin, and Sun, Hanqing
- Subjects
- TRANSFORMER models, REMOTE sensing, CONVOLUTIONAL neural networks, IMAGE reconstruction, IMAGE processing
- Abstract
Remote sensing image dehazing is a well-known remote sensing image processing task focused on restoring clean images from hazy images. The Transformer network, based on the self-attention mechanism, has demonstrated remarkable advantages in various image restoration tasks, due to its capacity to capture long-range dependencies within images. However, it is weak at modeling local context. Conversely, convolutional neural networks (CNNs) are adept at capturing local contextual information. Local contextual information provides more details, while long-range dependencies capture global structure information. The combination of long-range dependencies and local context modeling is beneficial for remote sensing image dehazing. Therefore, in this paper, we propose a CNN-based adaptive local context enrichment module (ALCEM) to extract contextual information within local regions. Subsequently, we integrate our proposed ALCEM into the multi-head self-attention and feed-forward network of the Transformer, constructing a novel locally enhanced attention (LEA) and a local continuous-enhancement feed-forward network (LCFN). The LEA utilizes the ALCEM to inject local context information that is complementary to the long-range relationship modeled by multi-head self-attention, which is beneficial to removing haze and restoring details. The LCFN extracts multi-scale spatial information and selectively fuses it with the ALCEM, which supplies richer information than regular feed-forward networks, whose information flow is position-specific. Powered by the LEA and LCFN, a novel Transformer-based dehazing network termed LCEFormer is proposed to restore clear images from hazy remote sensing images, combining the advantages of CNNs and Transformers. Experiments conducted on three distinct datasets, namely DHID, ERICE, and RSID, demonstrate that our proposed LCEFormer achieves state-of-the-art performance in hazy scenes. 
Specifically, our LCEFormer outperforms DCIL by 0.78 dB and 0.018 for PSNR and SSIM on the DHID dataset. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
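For reference, the PSNR figure quoted in the comparison (a 0.78 dB gain over DCIL) is the standard peak signal-to-noise ratio:

```python
import numpy as np

def psnr(ref, img, peak=1.0):
    """Peak signal-to-noise ratio in dB: 10*log10(peak^2 / MSE).
    Higher is better; +0.78 dB means a measurably lower reconstruction MSE."""
    mse = np.mean((ref - img) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

# uniform error of 0.1 on a [0, 1]-scaled image gives MSE 0.01 -> 20 dB
val = psnr(np.zeros((4, 4)), np.full((4, 4), 0.1))
```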
13. Detection of Military Targets on Ground and Sea by UAVs with Low-Altitude Oblique Perspective.
- Author
- Zeng, Bohan, Gao, Shan, Xu, Yuelei, Zhang, Zhaoxiang, Li, Fan, and Wang, Chenghang
- Subjects
- CONVOLUTIONAL neural networks, TRANSFORMER models
- Abstract
Small-scale low-altitude unmanned aerial vehicles (UAVs) equipped with perception capability for military targets will become increasingly essential for strategic reconnaissance and stationary patrols in the future. To respond to challenges such as complex terrain and weather variations, as well as the deception and camouflage of military targets, this paper proposes a hybrid detection model that combines Convolutional Neural Network (CNN) and Transformer architecture in a decoupled manner. The proposed detector consists of the C-branch and the T-branch. In the C-branch, Multi-gradient Path Network (MgpNet) is introduced, inspired by the multi-gradient flow strategy, excelling in capturing the local feature information of an image. In the T-branch, RPFormer, a Region–Pixel two-stage attention mechanism, is proposed to aggregate the global feature information of the whole image. A feature fusion strategy is proposed to merge the feature layers of the two branches, further improving the detection accuracy. Furthermore, to better simulate real UAVs' reconnaissance environments, we construct a dataset of military targets in complex environments captured from an oblique perspective to evaluate the proposed detector. In ablation experiments, different fusion methods are validated, and the results demonstrate the effectiveness of the proposed fusion strategy. In comparative experiments, the proposed detector outperforms most advanced general detectors. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
14. Prediction of Sea Surface Temperature Using U-Net Based Model.
- Author
- Ren, Jing, Wang, Changying, Sun, Ling, Huang, Baoxiang, Zhang, Deyu, Mu, Jiadong, and Wu, Jianqiang
- Subjects
- OCEAN temperature, CONVOLUTIONAL neural networks
- Abstract
Sea surface temperature (SST) is a key parameter in ocean hydrology. Currently, existing SST prediction methods fail to fully utilize the potential spatial correlation between variables. To address this challenge, we propose a spatiotemporal UNet (ST-UNet) model based on the UNet model. In particular, in the encoding phase of ST-UNet, we use parallel convolution with different kernel sizes to efficiently extract spatial features, and use ConvLSTM to capture temporal features based on the utilization of spatial features. An Atrous Spatial Pyramid Pooling (ASPP) module is placed at the bottleneck of the network to further incorporate multi-scale features, allowing the spatial features to be fully utilized. The final prediction is then generated in the decoding stage using parallel convolution with different kernel sizes, similar to the encoding stage. We conducted a series of experiments on the Bohai Sea and Yellow Sea SST data set, as well as the South China Sea SST data set, using SST data from the past 35 days to predict SST data for 1, 3, and 7 days in the future. The model was trained using data spanning from 2010 to 2021, with data from 2022 being utilized to assess the model's predictive performance. The experimental results show that the model proposed in this research paper achieves excellent results at different prediction scales in both sea areas, and the model consistently outperforms other methods. Specifically, in the Bohai Sea and Yellow Sea areas, when the prediction scales are 1, 3, and 7 days, ST-UNet improves on the best MAE of the other three compared models by 17%, 12%, and 2%, and the best MSE by 16%, 18%, and 9%, respectively. In the South China Sea, when the prediction ranges are 1, 3, and 7 days, ST-UNet improves MAE by 27%, 18%, and 3%, and MSE by 46%, 39%, and 16%, over the best of the other three compared models, respectively. 
Our results highlight the effectiveness of the ST-UNet model in capturing spatial correlations and accurately predicting SST. The proposed model is expected to improve marine hydrographic studies. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
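The encoder's "parallel convolution with different kernel sizes" can be sketched in one dimension (a single-channel toy with averaging kernels; the real model uses learned 2D kernels plus ConvLSTM):

```python
import numpy as np

def multi_kernel_block(x, kernel_sizes=(3, 5, 7)):
    """Run parallel convolutions with different receptive fields over the
    same input and stack the responses, so downstream layers see spatial
    context at several scales at once."""
    outs = []
    for k in kernel_sizes:
        kernel = np.ones(k) / k                    # illustrative averaging kernel
        outs.append(np.convolve(x, kernel, mode="same"))
    return np.stack(outs)                          # (branches, length)

feat = multi_kernel_block(np.arange(10, dtype=float))
```

Stacking (or concatenating) the branches rather than picking one kernel size is what lets the encoder capture both small-scale fronts and basin-scale gradients in the SST field.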
15. A Renovated Framework of a Convolution Neural Network with Transformer for Detecting Surface Changes from High-Resolution Remote-Sensing Images.
- Author
- Yao, Shunyu, Wang, Han, Su, Yalu, Li, Qing, Sun, Tao, Liu, Changjun, Li, Yao, and Cheng, Deqiang
- Subjects
- CONVOLUTIONAL neural networks, TRANSFORMER models, SURFACE of the earth, FEATURE extraction, REMOTE sensing
- Abstract
Natural hazards are considered to have a strong link with climate change and human activities. With the rapid advancements in remote sensing technology, real-time monitoring and high-resolution remote-sensing images have become increasingly available, which provide precise details about the Earth's surface and enable prompt updates to support risk identification and management. This paper proposes a new network framework with Transformer architecture and a Residual network for detecting changes in high-resolution remote-sensing images. The proposed model is trained using remote-sensing images from Shandong and Anhui Provinces of China in 2021 and 2022, while one district in 2023 is used to test the prediction accuracy. The performance of the proposed model is evaluated using five metrics and further compared to both convolution-based and attention-based models. The results demonstrated that the proposed structure integrates the strong capability of convolutional neural networks for image feature extraction with the ability to obtain global context from the attention mechanism, resulting in significant improvements in balancing positive sample identification while avoiding false positives in complex image change detection. Additionally, a toolkit supporting image preprocessing is developed for practical applications. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
16. Ship Detection with Deep Learning in Optical Remote-Sensing Images: A Survey of Challenges and Advances.
- Author
- Zhao, Tianqi, Wang, Yongcheng, Li, Zheng, Gao, Yunxiao, Chen, Chi, Feng, Hao, and Zhao, Zhikang
- Subjects
- DEEP learning, REMOTE-sensing images, OPTICAL remote sensing, OPTICAL images, CONVOLUTIONAL neural networks, TRANSFORMER models, FEATURE extraction
- Abstract
Ship detection aims to automatically identify whether there are ships in an image and to precisely classify and localize them. Regardless of whether utilizing early manually designed methods or deep learning technology, ship detection is dedicated to exploring the inherent characteristics of ships to enhance recall. Nowadays, high-precision ship detection plays a crucial role in civilian and military applications. In order to provide a comprehensive review of ship detection in optical remote-sensing images (SDORSIs), this paper summarizes the challenges as a guide. These challenges include complex marine environments, insufficient discriminative features, large scale variations, dense and rotated distributions, large aspect ratios, and imbalances between positive and negative samples. We meticulously review the improvement methods and conduct a detailed analysis of their strengths and weaknesses. We compile ship information from common optical remote-sensing image datasets and compare algorithm performance. Simultaneously, we compare and analyze the feature extraction capabilities of backbones based on CNNs and Transformers, seeking new directions for the development of SDORSIs. Promising prospects are provided to facilitate further research in the future. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
17. Object Identification in Land Parcels Using a Machine Learning Approach.
- Author
- Gundermann, Niels, Löwe, Welf, Fransson, Johan E. S., Olofsson, Erika, and Wehrenpfennig, Andreas
- Subjects
- MACHINE learning, CONVOLUTIONAL neural networks, IMAGE recognition (Computer vision), ARTIFICIAL intelligence, LAND use
- Abstract
This paper introduces an AI-based approach to detect human-made objects, and changes in these, on land parcels. To this end, we used binary image classification performed by a convolutional neural network. Binary classification requires the selection of a decision boundary, and we provide a deterministic method for this selection. Furthermore, we varied different parameters to improve the performance of our approach, leading to a true positive rate of 91.3% and a true negative rate of 63.0%. A specific application of our work supports the administration of agricultural land parcels eligible for subsidies. As a result of our findings, authorities could reduce the effort involved in the detection of human-made changes by approximately 50%. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
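The abstract mentions a deterministic method for selecting the binary decision boundary without specifying it; one standard, reproducible choice is maximizing Youden's J statistic over candidate thresholds (shown here as an illustrative assumption, not the paper's method):

```python
import numpy as np

def select_boundary(scores, labels):
    """Deterministically pick the classifier score threshold that maximizes
    Youden's J = TPR + TNR - 1, scanning every observed score as a candidate.
    Given fixed scores and labels, the result is fully reproducible."""
    best_t, best_j = 0.5, -1.0
    for t in np.unique(scores):          # np.unique sorts candidates
        pred = scores >= t
        tpr = np.mean(pred[labels == 1])   # true positive rate
        tnr = np.mean(~pred[labels == 0])  # true negative rate
        j = tpr + tnr - 1.0
        if j > best_j:
            best_j, best_t = j, t
    return best_t

scores = np.array([0.1, 0.2, 0.6, 0.8, 0.9])
labels = np.array([0, 0, 1, 1, 1])
boundary = select_boundary(scores, labels)
```

Balancing TPR against TNR this way matches the trade-off the abstract reports (91.3% true positives versus 63.0% true negatives); a different operating point could be chosen by weighting the two rates.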
18. TransHSI: A Hybrid CNN-Transformer Method for Disjoint Sample-Based Hyperspectral Image Classification.
- Author
- Zhang, Ping, Yu, Haiyang, Li, Pengao, and Wang, Ruili
- Subjects
- IMAGE recognition (Computer vision), CONVOLUTIONAL neural networks, TRANSFORMER models, CLASSIFICATION algorithms, MULTISENSOR data fusion, FEATURE extraction
- Abstract
Hyperspectral image (HSI) classification research has seen significant progress with the use of convolutional neural networks (CNNs) and Transformer blocks. However, these studies primarily incorporated Transformer blocks at the end of their network architectures. Due to significant differences between the spectral and spatial features in HSIs, the extraction of both global and local spectral–spatial features remains incomplete. To address this challenge, this paper introduces a novel method called TransHSI. This method incorporates a new spectral–spatial feature extraction module that fuses 3D CNNs with Transformer blocks to extract the local and global spectral features of HSIs, and then combines 2D CNNs and Transformer blocks to comprehensively capture the local and global spatial features of HSIs. Furthermore, a fusion module is proposed, which not only integrates the learned shallow and deep features of HSIs but also applies a semantic tokenizer to transform the fused features, enhancing the discriminative power of the features. This paper conducts experiments on three public datasets: Indian Pines, Pavia University, and Data Fusion Contest 2018. The training and test sets are selected based on a disjoint sampling strategy. We perform a comparative analysis with 11 traditional and advanced HSI classification algorithms. The experimental results demonstrate that the proposed TransHSI algorithm achieves the highest overall accuracies and kappa coefficients, indicating a competitive performance. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
19. Self-Supervised Convolutional Neural Network Learning in a Hybrid Approach Framework to Estimate Chlorophyll and Nitrogen Content of Maize from Hyperspectral Images.
- Author
-
Gallo, Ignazio, Boschetti, Mirco, Rehman, Anwar Ur, and Candiani, Gabriele
- Subjects
- *
CONVOLUTIONAL neural networks , *BLENDED learning , *MACHINE learning , *SUPERVISED learning , *CHLOROPHYLL - Abstract
The new generation of available (e.g., PRISMA, EnMAP, DESIS) and future (e.g., ESA-CHIME, NASA-SBG) spaceborne hyperspectral missions provides unprecedented data for environmental and agricultural monitoring, such as crop trait assessment. This paper focuses on retrieving two crop traits, specifically canopy-level Chlorophyll and Nitrogen content (CCC and CNC), from hyperspectral images acquired during the CHIME-RCS project, exploiting a self-supervised learning (SSL) technique. SSL is a machine learning paradigm that leverages unlabeled data to generate valuable representations for downstream tasks, bridging the gap between unsupervised and supervised learning. The proposed method comprises pre-training and fine-tuning procedures: in the first stage, a denoising Convolutional Autoencoder is trained using pairs of noisy and clean CHIME-like images; the pre-trained Encoder network is utilized as-is or fine-tuned in the second stage. The paper demonstrates the applicability of this technique in hybrid approaches that combine Radiative Transfer Modelling (RTM) and a Machine Learning Regression Algorithm (MLRA) to set up a retrieval schema able to estimate crop traits from new-generation spaceborne hyperspectral data. The results showcase excellent prediction accuracy for estimating CCC (R2 = 0.8318; RMSE = 0.2490) and CNC (R2 = 0.9186; RMSE = 0.7908) for maize crops from CHIME-like images without requiring further ground data calibration. [ABSTRACT FROM AUTHOR]
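The pre-training stage described above supervises a denoising autoencoder with (noisy, clean) image pairs. As a deliberately simplified stand-in, the sketch below learns a *linear* denoising map by closed-form ridge regression on synthetic spectra; it only illustrates the noisy-input / clean-target supervision pattern, not the paper's convolutional architecture.

```python
import numpy as np

# Simplified stand-in for the pre-training stage: learn a linear "denoiser" W
# from (noisy, clean) spectra pairs via ridge regression, then reuse it as a
# fixed mapping. Assumption: the actual method uses a denoising Convolutional
# Autoencoder; this linear sketch is illustrative only.

rng = np.random.default_rng(0)
clean = rng.normal(size=(200, 16))            # 200 synthetic 16-band spectra
noisy = clean + 0.1 * rng.normal(size=clean.shape)

lam = 1e-2                                    # ridge regularization strength
W = np.linalg.solve(noisy.T @ noisy + lam * np.eye(16), noisy.T @ clean)

denoised = noisy @ W                          # "pre-trained" denoising mapping
err_before = np.mean((noisy - clean) ** 2)
err_after = np.mean((denoised - clean) ** 2)  # reduced reconstruction error
```

In the real pipeline, the encoder half of the trained network would then be frozen or fine-tuned for the CCC/CNC regression task.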
- Published
- 2023
- Full Text
- View/download PDF
20. PolSAR Image Classification with Active Complex-Valued Convolutional-Wavelet Neural Network and Markov Random Fields.
- Author
-
Liu, Lu and Li, Yongxiang
- Subjects
- *
IMAGE recognition (Computer vision) , *CONVOLUTIONAL neural networks , *SPECKLE interference , *MARKOV random fields , *WAVELET transforms , *ACTIVE learning - Abstract
PolSAR image classification has attracted extensive research in recent decades. Aiming at improving PolSAR classification performance under speckle noise, this paper proposes an active complex-valued convolutional-wavelet neural network that incorporates the dual-tree complex wavelet transform (DT-CWT) and a Markov random field (MRF). In this approach, DT-CWT is introduced into the complex-valued convolutional neural network to suppress the speckle noise of PolSAR images and maintain the structures of the learned feature maps. In addition, by applying active learning (AL), we iteratively select the most informative unlabeled training samples of the PolSAR datasets. Moreover, MRF is utilized to obtain spatial local correlation information, which has been proven effective in improving classification performance. The experimental results on three benchmark PolSAR datasets demonstrate that the proposed method achieves a significant classification performance gain in terms of effectiveness and robustness beyond some state-of-the-art deep learning methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
21. Joint Retrieval of Multiple Species of Ice Hydrometeor Parameters from Millimeter and Submillimeter Wave Brightness Temperature Based on Convolutional Neural Networks.
- Author
-
Chen, Ke, Wu, Jiasheng, and Chen, Yingying
- Subjects
- *
SUBMILLIMETER waves , *CONVOLUTIONAL neural networks , *BRIGHTNESS temperature , *MILLIMETER waves , *MONTE Carlo method , *ASTROCHEMISTRY - Abstract
Submillimeter wave radiometers are promising remote sensing tools for sounding ice cloud parameters. The Ice Cloud Imager (ICI) aboard the second generation of the EUMETSAT Polar System (EPS-SG) is the first operational submillimeter wave radiometer used for ice cloud remote sensing. Ice clouds simultaneously contain three species of ice hydrometeors (ice, snow, and graupel), whose physical distributions and submillimeter wave radiation characteristics differ. Therefore, jointly retrieving the mass parameters of the three ice hydrometeors from submillimeter brightness temperatures is very challenging. In this paper, we propose a retrieval algorithm for the parameters of multiple ice hydrometeor species based on convolutional neural networks (CNNs) that can jointly retrieve the total content and vertical profiles of ice, snow, and graupel particles from submillimeter brightness temperatures. The training dataset is generated by a numerical weather prediction (NWP) model and a submillimeter wave radiative transfer (RT) model. In this study, an end-to-end ICI simulation experiment involving forward modeling of the brightness temperature and retrieval of ice cloud parameters was conducted to verify the effectiveness of the proposed CNN retrieval algorithm. Compared with the classical Unet, the average relative errors of the improved RCNN–ResUnet are reduced by 11%, 25%, and 18% in GWP, IWP, and SWP retrieval, respectively. Compared with the Bayesian Monte Carlo integration algorithm, the average relative error of the total content retrieved by RCNN–ResUnet is reduced by 71%. Compared with the BP neural network algorithm, the average relative error of the vertical profiles retrieved by RCNN–ResUnet is reduced by 69%.
In addition, this algorithm was applied to actual Advanced Technology Microwave Sounder (ATMS) 183 GHz observed brightness temperatures to retrieve graupel particle parameters with a relative error in the total content of less than 25% and a relative error in the profile of less than 35%. The results show that the proposed CNN algorithm can be applied to future spaceborne submillimeter wave radiometers to jointly retrieve the mass parameters of ice, snow, and graupel. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
22. Forest Aboveground Biomass Estimation Using Multisource Remote Sensing Data and Deep Learning Algorithms: A Case Study over Hangzhou Area in China.
- Author
-
Tian, Xin, Li, Jiejie, Zhang, Fanyi, Zhang, Haibo, and Jiang, Mi
- Subjects
- *
DEEP learning , *BIOMASS estimation , *MACHINE learning , *MULTISPECTRAL imaging , *REMOTE sensing , *FOREST biomass , *CONVOLUTIONAL neural networks , *SYNTHETIC aperture radar - Abstract
The accurate estimation of forest aboveground biomass is of great significance for forest management and carbon balance monitoring. Remote sensing instruments have been widely applied in forest parameter inversion thanks to their wide coverage and high spatiotemporal resolution. In this paper, the capability of different remotely sensed imagery for aboveground forest biomass estimation was investigated, including multispectral images (GaoFen-6, Sentinel-2, and Landsat-8) and various SAR (Synthetic Aperture Radar) data (GaoFen-3, Sentinel-1, ALOS-2). In particular, based on the forest inventory data of Hangzhou in China, the Random Forest (RF), Convolutional Neural Network (CNN), and Convolutional Neural Network Long Short-Term Memory (CNN-LSTM) algorithms were deployed to construct the forest biomass estimation models. The estimation accuracy was evaluated under different configurations of images and methods. The results show that for the SAR data, ALOS-2 has a higher biomass estimation accuracy than GaoFen-3 and Sentinel-1. Moreover, the GaoFen-6 data are slightly worse than the Sentinel-2 and Landsat-8 optical data in biomass estimation. In contrast with a single source, integrating multisource data can effectively enhance accuracy, with improvements ranging from 5% to 10%. The CNN-LSTM generally performs better than CNN and RF, regardless of the data used. The combination of CNN-LSTM and multisource data provided the best results in this case, achieving a maximum R2 value of up to 0.74. It was found that the majority of the biomass values in the study area in 2018 ranged from 60 to 90 Mg/ha, with an average value of 64.20 Mg/ha. [ABSTRACT FROM AUTHOR]
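The model comparison above is scored with the coefficient of determination (R2). For reference, a minimal implementation of that metric is sketched below; the biomass values are synthetic placeholders, not data from the study.

```python
import numpy as np

# Minimal sketch of the R^2 score used to compare biomass models in the
# abstract. Assumption: the data here are synthetic, not from the study.

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

biomass = [60.0, 75.0, 90.0, 64.2, 80.0]   # Mg/ha, synthetic ground truth
pred    = [62.0, 73.0, 88.0, 66.0, 79.0]   # synthetic model predictions
score = r2_score(biomass, pred)
```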
- Published
- 2024
- Full Text
- View/download PDF
23. CroplandCDNet: Cropland Change Detection Network for Multitemporal Remote Sensing Images Based on Multilayer Feature Transmission Fusion of an Adaptive Receptive Field.
- Author
-
Wu, Qiang, Huang, Liang, Tang, Bo-Hui, Cheng, Jiapei, Wang, Meiqi, and Zhang, Zixuan
- Subjects
- *
CONVOLUTIONAL neural networks , *CHANGE-point problems , *FARMS , *MARKOV random fields , *REMOTE-sensing images , *FEATURE extraction - Abstract
Dynamic monitoring of cropland using high spatial resolution remote sensing images is a powerful means to protect cropland resources. However, when a change detection method based on a convolutional neural network employs a large number of convolution and pooling operations to mine the deep features of cropland, the accumulation of irrelevant features and the loss of key features lead to poor detection results. To effectively solve this problem, a novel cropland change detection network (CroplandCDNet) is proposed in this paper; this network combines an adaptive receptive field and multiscale feature transmission fusion to achieve accurate detection of cropland change information. CroplandCDNet first extracts the multiscale features of cropland from bitemporal remote sensing images through the feature extraction module and subsequently embeds the receptive-field-adaptive SK attention (SKA) module to emphasize cropland change. Moreover, the SKA module effectively uses spatial context information to dynamically adjust the convolution kernel size for cropland features at different scales. Finally, multiscale features and difference features are transmitted and fused layer by layer to obtain the cropland change results. In the experiments, the proposed method is compared with six advanced change detection methods on the cropland change detection dataset (CLCD). The experimental results show that CroplandCDNet achieves the best F1 and OA at 76.04% and 94.47%, respectively. Its precision and recall are the second best of all models at 76.46% and 75.63%, respectively. Moreover, a generalization experiment was carried out using the Jilin-1 dataset, which effectively verified the reliability of CroplandCDNet in cropland change detection. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
24. AIDB-Net: An Attention-Interactive Dual-Branch Convolutional Neural Network for Hyperspectral Pansharpening.
- Author
-
Sun, Qian, Sun, Yu, and Pan, Chengsheng
- Subjects
- *
CONVOLUTIONAL neural networks , *DEEP learning - Abstract
Despite notable advancements achieved on Hyperspectral (HS) pansharpening tasks through deep learning techniques, previous methods are inherently constrained by the intrinsic defects of convolution or self-attention, leading to limited performance. In this paper, we propose an Attention-Interactive Dual-Branch Convolutional Neural Network (AIDB-Net) for HS pansharpening. Our model consists purely of convolutional layers and simultaneously inherits the strengths of both convolution and self-attention, especially the modeling of short- and long-range dependencies. Specifically, we first extract, tokenize, and align the hyperspectral image (HSI) and panchromatic image (PAN) with Overlapping Patch Embedding Blocks. Then, we introduce a novel Spectral-Spatial Interactive Attention that globally interacts with and fuses the cross-modality features. The resultant token-global similarity scores guide the refinement and renewal of the textural details and spectral characteristics within the HSI features. By deeply combining these two paradigms, our AIDB-Net significantly improves pansharpening performance. Moreover, accelerated by the convolutional inductive bias, our interactive attention can be trained without a large-scale dataset and achieves a competitive time cost with its counterparts. Compared with the state-of-the-art methods, our AIDB-Net achieves 5.2%, 3.1%, and 2.2% improvements on the PSNR metric on three public datasets, respectively. Comprehensive experiments quantitatively and qualitatively demonstrate the effectiveness and superiority of our AIDB-Net. [ABSTRACT FROM AUTHOR]
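The gains above are reported on the PSNR metric. For readers unfamiliar with it, a minimal PSNR implementation for signals scaled to [0, 1] is sketched below (illustrative values, unrelated to the paper's datasets).

```python
import math

# Minimal PSNR (peak signal-to-noise ratio) for signals in [0, 1].
# Assumption: the sample values are illustrative, not from the paper.

def psnr(reference, test, peak=1.0):
    """10 * log10(peak^2 / MSE); higher means closer to the reference."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return float("inf")
    return 10.0 * math.log10(peak ** 2 / mse)

ref = [0.0, 0.5, 1.0, 0.25]     # reference pixel values
deg = [0.1, 0.5, 0.9, 0.25]     # degraded reconstruction
value = psnr(ref, deg)          # about 23 dB for this example
```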
- Published
- 2024
- Full Text
- View/download PDF
25. Learning Point Processes and Convolutional Neural Networks for Object Detection in Satellite Images.
- Author
-
Mabon, Jules, Ortner, Mathias, and Zerubia, Josiane
- Subjects
- *
OBJECT recognition (Computer vision) , *CONVOLUTIONAL neural networks , *POINT processes , *REMOTE-sensing images , *GABOR filters , *ARTIFICIAL satellites - Abstract
Convolutional neural networks (CNNs) have shown great results for object-detection tasks by learning texture and pattern-extraction filters. However, object-level interactions are harder to grasp without increasing the complexity of the architectures. On the other hand, Point Process models propose to solve the detection of the configuration of objects as a whole, allowing the image data and the objects' prior interactions to be factored in. In this paper, we propose combining the information extracted by a CNN with priors on objects within a Markov Marked Point Process framework. We also propose a method to learn the parameters of this Energy-Based Model. We apply this model to the detection of small vehicles in optical satellite imagery, where the image information needs to be complemented with object interaction priors because of noise and small object sizes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
26. A Lightning Classification Method Based on Convolutional Encoding Features.
- Author
-
Zhu, Shunxing, Zhang, Yang, Fan, Yanfeng, Sun, Xiubin, Zheng, Dong, Zhang, Yijun, Lyu, Weitao, Zhang, Huiyi, and Wang, Jingxuan
- Subjects
- *
CONVOLUTIONAL neural networks , *RANDOM forest algorithms , *THUNDERSTORMS - Abstract
At present, in operational lightning location systems, the classification of lightning discharge types is mostly based on lightning pulse signal features, and there is still considerable room for improvement. We propose a lightning discharge classification method based on convolutional encoding features. This method utilizes convolutional neural networks to extract encoding features and uses random forests to classify the extracted features, achieving high-accuracy discrimination for various lightning discharge events. Compared with traditional multi-parameter-based methods, the new method proposed in this paper can identify multiple lightning discharge events and does not require precise, detailed feature engineering to extract individual pulse parameters. The accuracy of this method for identifying lightning discharge types among intra-cloud flashes (IC), cloud-to-ground flashes (CG), and narrow bipolar events (NBEs) is 97%, which is higher than that of multi-parameter methods. Moreover, our method can classify lightning signals at a faster speed. Under the same conditions, the new method requires only 28.2 µs to identify one pulse, while deep learning-based methods require 300 µs. With its faster recognition speed and higher accuracy across multiple discharge types, the method can better meet the needs of real-time operational positioning. [ABSTRACT FROM AUTHOR]
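The two-stage design above (a CNN encoder feeding a random forest) can be illustrated with a much simpler stand-in pipeline: a fixed convolution kernel produces "encoding features" and a nearest-centroid rule does the classification. Both pieces are simplifications of the paper's trained CNN and random forest, and the pulse data are synthetic.

```python
import numpy as np

# Illustrative two-stage pipeline in the spirit of the abstract: a fixed 1D
# convolution stands in for the CNN encoder, and a nearest-centroid rule
# stands in for the random forest. Assumptions: both substitutions are
# simplifications; the pulses below are synthetic, not lightning waveforms.

def encode(pulse, kernel=np.array([1.0, -2.0, 1.0])):
    """Convolutional encoding features: statistics of a second-difference response."""
    resp = np.convolve(pulse, kernel, mode="valid")
    return np.array([resp.mean(), resp.std(), np.abs(resp).max()])

def nearest_centroid(feature, centroids):
    """Assign the encoded pulse to the closest class centroid."""
    dists = {cls: np.linalg.norm(feature - c) for cls, c in centroids.items()}
    return min(dists, key=dists.get)

rng = np.random.default_rng(1)
smooth = np.sin(np.linspace(0, 3, 64))     # slowly varying synthetic pulse
spiky = rng.normal(size=64)                # noise-like synthetic pulse
centroids = {"smooth": encode(smooth), "spiky": encode(spiky)}
label = nearest_centroid(encode(np.sin(np.linspace(0, 3, 64)) + 0.01), centroids)
```

The appeal of the real design is that the encoder is trained once, while the forest classifier stays cheap to evaluate per pulse.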
- Published
- 2024
- Full Text
- View/download PDF
27. An Overlay Accelerator of DeepLab CNN for Spacecraft Image Segmentation on FPGA.
- Author
-
Guo, Zibo, Liu, Kai, Liu, Wei, Sun, Xiaoyao, Ding, Chongyang, and Li, Shangrong
- Subjects
- *
IMAGE segmentation , *COMPILERS (Computer programs) , *SPACE vehicles , *CONVOLUTIONAL neural networks , *FIELD programmable gate arrays , *INSTRUCTION set architecture - Abstract
Due to the absence of communication and coordination with external spacecraft, non-cooperative spacecraft present challenges for the servicing spacecraft in acquiring information about their pose and location. The accurate segmentation of non-cooperative spacecraft components in images is a crucial step in autonomously sensing the pose of non-cooperative spacecraft. This paper presents a novel overlay accelerator of DeepLab Convolutional Neural Networks (CNNs) for spacecraft image segmentation on an FPGA. First, several software–hardware co-design aspects are investigated: (1) A CNNs-domain COD instruction set (Control, Operation, Data Transfer) is presented based on a Load–Store architecture to enable the implementation of accelerator overlays. (2) An RTL-based prototype accelerator is developed for the COD instruction set. The accelerator incorporates dedicated units for instruction decoding and dispatch, scheduling, memory management, and operation execution. (3) A compiler is designed that leverages tiling and operation fusion techniques to optimize the execution of CNNs, generating binary instructions for the optimized operations. Our accelerator is implemented on a Xilinx Virtex-7 XC7VX690T FPGA at 200 MHz. Experiments demonstrate that with INT16 quantization our accelerator achieves an accuracy (mIoU) of 77.84%, experiencing only a 0.2% degradation compared to that of the original full-precision model, in accelerating the segmentation model of DeepLabv3+ ResNet18 on the spacecraft component images (SCIs) dataset. The accelerator boasts a performance of 184.19 GOPS/s and a computational efficiency (Runtime Throughput/Theoretical Roof Throughput) of 88.72%. Compared to previous work, our accelerator improves performance by 1.5× and computational efficiency by 43.93%, all while consuming similar hardware resources.
Additionally, in terms of instruction encoding, our instructions reduce the size by 1.5× to 49× when compiling the same model compared to previous work. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
28. Hyperspectral Image Classification Based on Mutually Guided Image Filtering.
- Author
-
Zhan, Ying, Hu, Dan, Yu, Xianchuan, and Wang, Yufeng
- Subjects
- *
IMAGE recognition (Computer vision) , *ARTIFICIAL neural networks , *CONVOLUTIONAL neural networks , *FEATURE extraction , *GENERATIVE adversarial networks , *HYPERSPECTRAL imaging systems , *REMOTE sensing - Abstract
Hyperspectral remote sensing images (HSIs) have both spectral and spatial characteristics. The adept exploitation of these attributes is central to enhancing the classification accuracy of HSIs. In order to effectively utilize spatial and spectral features to classify HSIs, this paper proposes a method for the spatial feature extraction of HSIs based on a mutually guided image filter (muGIF) combined with band-distance-grouped principal components. Firstly, aiming at the problem that previous guided image filtering cannot effectively deal with inconsistent information structure between the guidance and the target, a method for extracting spatial features using muGIF is proposed. Then, aiming at the information loss caused by using a single principal component as the guidance image in traditional GIF-based spatial–spectral classification, a spatial feature-extraction framework based on band-distance-grouped principal components is proposed. The method groups the bands according to the band distance and extracts the principal components of each band subset as the guidance map for filtering the current subset of the HSI. A deep convolutional neural network model and a generative adversarial network model are constructed for the filtered HSIs and then trained with samples for spatial–spectral HSI classification. Experiments show that compared with traditional methods and several popular filter-based spatial–spectral HSI classification methods, the proposed muGIF-based methods can effectively extract the spatial–spectral features and improve the classification accuracy of HSIs. [ABSTRACT FROM AUTHOR]
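The muGIF builds on the classic guided image filter. As background, a brute-force version of the standard single-guidance filter (He et al.) is sketched below; the mutual formulation that handles inconsistent guidance/target structure is not reproduced here, and the window size and regularization are illustrative.

```python
import numpy as np

# Brute-force classic guided image filter (single guidance), sketched as
# background for the muGIF. Assumptions: parameters r and eps are
# illustrative; the paper's mutually guided variant is not implemented.

def box_mean(img, r):
    """Mean over a (2r+1)x(2r+1) window, with edge clamping."""
    h, w = img.shape
    out = np.empty((h, w), dtype=float)
    for i in range(h):
        for j in range(w):
            i0, i1 = max(0, i - r), min(h, i + r + 1)
            j0, j1 = max(0, j - r), min(w, j + r + 1)
            out[i, j] = img[i0:i1, j0:j1].mean()
    return out

def guided_filter(I, p, r=1, eps=1e-3):
    """Filter target p using guidance I (both 2D float arrays)."""
    mean_I, mean_p = box_mean(I, r), box_mean(p, r)
    cov_Ip = box_mean(I * p, r) - mean_I * mean_p
    var_I = box_mean(I * I, r) - mean_I ** 2
    a = cov_Ip / (var_I + eps)          # local linear coefficients
    b = mean_p - a * mean_I
    return box_mean(a, r) * I + box_mean(b, r)

I = np.outer(np.linspace(0, 1, 8), np.ones(8))          # smooth guidance ramp
p = I + 0.05 * np.random.default_rng(2).normal(size=I.shape)  # noisy target
q = guided_filter(I, p)                                 # edge-aware smoothing
```

The filtered output follows the guidance's structure while suppressing target noise, which is exactly the property the classification pipeline exploits.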
- Published
- 2024
- Full Text
- View/download PDF
29. Transfer-Learning-Based Human Activity Recognition Using Antenna Array.
- Author
-
Ye, Kun, Wu, Sheng, Cai, Yongbin, Zhou, Lang, Xiao, Lijun, Zhang, Xuebo, Zheng, Zheng, and Lin, Jiaqing
- Subjects
- *
HUMAN activity recognition , *ANTENNA arrays , *CONVOLUTIONAL neural networks , *ARRAY processing - Abstract
Due to its low cost and privacy protection, Channel-State-Information (CSI)-based activity detection has gained interest recently. However, to achieve high accuracy, which is challenging in practice, a significant number of training samples is required. To address the issues of small sample size and cross-scenario generalization in neural network training, this paper proposes a WiFi human activity-recognition system based on transfer learning using an antenna array: Wi-AR. First, the Intel 5300 network card collects CSI signal measurements through an antenna array and processes them with a low-pass filter to reduce noise. Then, a threshold-based sliding window method is applied to extract the signals of independent activities, which are further transformed into time–frequency diagrams. Finally, the produced diagrams are used as input to a pretrained ResNet18 to recognize human activities. The proposed Wi-AR was evaluated using a dataset collected in three different room layouts. The testing results showed that Wi-AR recognizes human activities with a consistent accuracy of about 94%, outperforming a conventional convolutional neural network approach. [ABSTRACT FROM AUTHOR]
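The threshold-based sliding window step above can be sketched with a toy signal: windows whose variance exceeds a threshold are flagged as activity and merged into intervals. The window size, step, and threshold below are illustrative assumptions, not the paper's settings.

```python
# Sketch of a threshold-based sliding-window activity segmenter: mark
# windows whose variance exceeds a threshold and merge overlapping hits.
# Assumptions: window size, step, and threshold are illustrative only.

def activity_windows(signal, win=4, step=2, thresh=0.5):
    """Return (start, end) sample intervals where windowed variance > thresh."""
    intervals = []
    for start in range(0, len(signal) - win + 1, step):
        seg = signal[start:start + win]
        mean = sum(seg) / win
        var = sum((x - mean) ** 2 for x in seg) / win
        if var > thresh:
            if intervals and start <= intervals[-1][1]:
                intervals[-1] = (intervals[-1][0], start + win)  # merge overlap
            else:
                intervals.append((start, start + win))
    return intervals

idle = [0.0] * 8
active = [1.5, -1.5] * 4            # high-variance burst ("activity")
csi = idle + active + idle          # synthetic CSI amplitude trace
segments = activity_windows(csi)    # one merged activity interval
```

Each extracted interval would then be converted to a time–frequency diagram and fed to the pretrained network.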
- Published
- 2024
- Full Text
- View/download PDF
30. Remote Sensing for Maritime Monitoring and Vessel Identification.
- Author
-
Salerno, Emanuele, Di Paola, Claudio, and Lo Duca, Angelica
- Subjects
- *
DEEP learning , *REMOTE sensing , *CONVOLUTIONAL neural networks , *SURVEILLANCE radar , *SYNTHETIC aperture radar , *INFORMATION technology , *PATTERN recognition systems - Abstract
This document explores the significance of remote sensing in monitoring maritime activities and identifying vessels. It emphasizes the need for surveillance to ensure safety, security, and emergency management, given the increasing number of vessels worldwide. The document highlights the use of technologies like the Automatic Identification System (AIS) and remote sensing in situations where collaborative systems are not reliable. It also discusses the integration of data from different sensors and the application of data science techniques for a comprehensive assessment of maritime traffic. The document concludes by summarizing research papers on ship detection, tracking, and classification using various sensors and data processing techniques. [Extracted from the article]
- Published
- 2024
- Full Text
- View/download PDF
31. Multi-View Scene Classification Based on Feature Integration and Evidence Decision Fusion.
- Author
-
Zhou, Weixun, Shi, Yongxin, and Huang, Xiao
- Subjects
- *
FEATURE extraction , *IMAGE recognition (Computer vision) , *IMAGE fusion , *CONVOLUTIONAL neural networks , *DEEP learning - Abstract
Leveraging multi-view remote sensing images in scene classification tasks significantly enhances the precision of such classifications. This approach, however, poses challenges due to the simultaneous use of multi-view images, which often leads to a misalignment between the visual content and semantic labels, thus complicating the classification process. In addition, as the number of image viewpoints increases, the quality problem for remote sensing images further limits the effectiveness of multi-view image classification. Traditional scene classification methods predominantly employ SoftMax deep learning techniques, which lack the capability to assess the quality of remote sensing images or to provide explicit explanations for the network's predictive outcomes. To address these issues, this paper introduces a novel end-to-end multi-view decision fusion network specifically designed for remote sensing scene classification. The network integrates information from multi-view remote sensing images under the guidance of image credibility and uncertainty; when the multi-view fusion process encounters conflicting evidence, the network greatly alleviates these conflicts and provides more reasonable and credible predictions for the multi-view scene classification results. Initially, multi-scale features are extracted from the multi-view images using convolutional neural networks (CNNs). Following this, an asymptotic adaptive feature fusion module (AAFFM) is constructed to gradually integrate these multi-scale features. An adaptive spatial fusion method is then applied to assign different spatial weights to the multi-scale feature maps, thereby significantly enhancing the model's feature discrimination capability. Finally, an evidence decision fusion module (EDFM), utilizing evidence theory and the Dirichlet distribution, is developed. This module quantitatively assesses the uncertainty in the multi-perspective image classification process.
By fusing multi-perspective remote sensing image information in this module, a rational explanation for the prediction results is provided. The efficacy of the proposed method was validated through experiments conducted on the AiRound and CV-BrCT datasets. The results show that our method not only improves single-view scene classification results but also advances multi-view remote sensing scene classification by accurately characterizing the scene and mitigating conflicts in the fusion process. [ABSTRACT FROM AUTHOR]
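The Dirichlet-based evidence fusion idea can be sketched numerically: each view's evidence defines class beliefs plus an explicit uncertainty mass, and two views are combined with a reduced Dempster's rule. This follows the general evidential-fusion recipe from the literature, not the paper's exact EDFM, and the evidence values are synthetic.

```python
import numpy as np

# Sketch of evidential fusion of two views with a Dirichlet parameterization:
# evidence e defines beliefs b = e / S and uncertainty u = K / S, where
# alpha = e + 1 and S = sum(alpha). Two opinions are combined with a reduced
# Dempster's rule. Assumption: this is the generic recipe, not the paper's EDFM.

def opinion(evidence):
    e = np.asarray(evidence, float)
    K = e.size
    S = e.sum() + K                 # Dirichlet strength
    return e / S, K / S             # per-class beliefs, uncertainty mass

def combine(b1, u1, b2, u2):
    conflict = np.sum(np.outer(b1, b2)) - np.sum(b1 * b2)  # disagreeing mass
    scale = 1.0 / (1.0 - conflict)
    b = scale * (b1 * b2 + b1 * u2 + b2 * u1)
    u = scale * (u1 * u2)
    return b, u

b1, u1 = opinion([8.0, 1.0, 0.0])   # confident view
b2, u2 = opinion([2.0, 1.0, 0.0])   # weaker but agreeing view
b, u = combine(b1, u1, b2, u2)      # fused beliefs and uncertainty
```

When the views agree, the fused uncertainty drops below that of either single view; under conflict, the uncertainty term keeps the prediction appropriately cautious.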
- Published
- 2024
- Full Text
- View/download PDF
32. A CFAR-Enhanced Ship Detector for SAR Images Based on YOLOv5s.
- Author
-
Wen, Xue, Zhang, Shaoming, Wang, Jianmei, Yao, Tangjun, and Tang, Yan
- Subjects
- *
IMAGE recognition (Computer vision) , *IMAGE converters , *SYNTHETIC aperture radar , *TRAFFIC monitoring , *CONVOLUTIONAL neural networks , *RESEARCH vessels , *IMAGE analysis - Abstract
Ship detection and recognition in Synthetic Aperture Radar (SAR) images are crucial for maritime surveillance and traffic management. The limited availability of high-quality datasets hinders in-depth exploration of ship features in complex SAR images. Most existing SAR ship research is based on Convolutional Neural Networks (CNNs); although deep learning advances SAR image interpretation, it often prioritizes recognition over computational efficiency and underutilizes SAR image prior information. Therefore, this paper proposes YOLOv5s-based ship detection in SAR images. Firstly, for comprehensive detection enhancement, we employ the lightweight YOLOv5s model as the baseline. Secondly, we introduce a sub-net into YOLOv5s that learns traditional Constant False Alarm Rate (CFAR) features to augment the ship feature representation. Additionally, we incorporate frequency-domain information into the channel attention mechanism to further improve detection. Extensive experiments on the Ship Recognition and Detection Dataset (SRSDDv1.0) in complex SAR scenarios confirm our method's 68.04% detection accuracy and 60.25% recall, with a compact 18.51 M model size. Our network surpasses peers in mAP, F1 score, model size, and inference speed, displaying robustness across diverse complex scenes. [ABSTRACT FROM AUTHOR]
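CFAR, the traditional detector whose features the sub-net above learns to complement, is easy to sketch in one dimension: each cell is compared against a scaled average of its surrounding training cells, skipping guard cells around the cell under test. The parameters below are illustrative, and this standalone sketch is not the paper's network.

```python
import numpy as np

# Classic 1D cell-averaging CFAR: declare a detection where the cell power
# exceeds scale * (mean of surrounding training cells), excluding guard
# cells. Assumptions: guard/train/scale values are illustrative only.

def ca_cfar(power, guard=2, train=8, scale=3.0):
    """Return indices whose power exceeds scale * local noise estimate."""
    n = len(power)
    detections = []
    for i in range(n):
        cells = []
        for j in range(i - guard - train, i + guard + train + 1):
            if 0 <= j < n and abs(j - i) > guard:   # training cells only
                cells.append(power[j])
        if cells and power[i] > scale * np.mean(cells):
            detections.append(i)
    return detections

rng = np.random.default_rng(3)
noise = rng.exponential(scale=1.0, size=64)   # synthetic clutter power
noise[30] += 25.0                             # bright target at index 30
hits = ca_cfar(noise, guard=2, train=8, scale=4.0)
```

Because the threshold adapts to the local clutter level, the false alarm rate stays roughly constant across regions of different background power, which is the "constant false alarm rate" property.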
- Published
- 2024
- Full Text
- View/download PDF
33. Remote Sensing Extraction of Lakes on the Tibetan Plateau Based on the Google Earth Engine and Deep Learning.
- Author
-
Pang, Yunxuan, Yu, Junchuan, Xi, Laidian, Ge, Daqing, Zhou, Ping, Hou, Changhong, He, Peng, and Zhao, Liu
- Subjects
- *
DEEP learning , *CONVOLUTIONAL neural networks , *REMOTE sensing , *WATER management , *LAKES , *WATER supply - Abstract
Lakes are an important component of global water resources. In order to achieve accurate lake extraction on a large scale, this study takes the Tibetan Plateau as the study area and proposes an Automated Lake Extraction Workflow (ALEW) based on the Google Earth Engine (GEE) and deep learning, in response to the problems of low lake identification accuracy and low efficiency in complex situations. It involves pre-processing massive images and creating a database of lake extraction examples for the Tibetan Plateau. A lightweight convolutional neural network named LiteConvNet is constructed that obtains spatial–spectral features for accurate extraction while using fewer computational resources. We execute model training and online prediction using the Google Cloud platform, which leads to the rapid extraction of lakes over the whole Tibetan Plateau. We assess LiteConvNet, along with thresholding, traditional machine learning, and various open-source classification products, through both visual interpretation and quantitative analysis. The results demonstrate that the LiteConvNet model greatly enhances the precision of lake extraction in intricate settings, achieving an overall accuracy of 97.44%. The method presented in this paper demonstrates promising capabilities for extracting lake information on a large scale, offering practical benefits for the remote sensing monitoring and management of water resources in cloudy and climate-differentiated regions. [ABSTRACT FROM AUTHOR]
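One of the baselines above is simple thresholding. A common water-thresholding rule, sketched here as an assumed example (the abstract does not name the exact index used), flags pixels where the Normalized Difference Water Index NDWI = (green − NIR) / (green + NIR) is positive; the band values below are illustrative reflectances, not GEE data.

```python
# Hypothetical thresholding baseline: NDWI = (green - NIR) / (green + NIR),
# with water flagged where NDWI > 0. Assumptions: the abstract does not
# specify the index; band values are illustrative reflectances.

def ndwi_mask(green, nir, thresh=0.0):
    """Return a boolean water mask from per-pixel green and NIR reflectance."""
    mask = []
    for g, n in zip(green, nir):
        ndwi = (g - n) / (g + n) if (g + n) else 0.0
        mask.append(ndwi > thresh)
    return mask

green = [0.30, 0.08, 0.25, 0.10]   # water reflects relatively more green...
nir   = [0.05, 0.30, 0.04, 0.35]   # ...and absorbs NIR strongly
mask = ndwi_mask(green, nir)       # water / non-water per pixel
```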
- Published
- 2024
- Full Text
- View/download PDF
34. Ship Formation Identification with Spatial Features and Deep Learning for HFSWR.
- Author
-
Wang, Jiaqi, Liu, Aijun, Yu, Changjun, and Ji, Yuanzheng
- Subjects
- *
DEEP learning , *CONVOLUTIONAL neural networks , *MACHINE learning , *SHIP models , *SHIPS - Abstract
Ship detection has been an area of focus for high-frequency surface wave radar (HFSWR). The detection and identification of ship formations have proven significant for early warning, while studies on formation identification are limited due to the complex background and low resolution of HFSWR. In this paper, we first establish a spatial distribution model of ship formations in HFSWR. Then, we propose a cascade identification algorithm for ship formations at the clutter edge. The proposed algorithm includes a preprocessing stage and a two-stage formation identification stage. The Faster R-CNN is introduced in the preprocessing stage to locate the clutter regions. In the first stage, we propose an extremum detector based on connected regions to extract suspicious regions. The suspicious regions contain ship formations, single-ship targets, and false targets. In the second stage, we design a network connecting a convolutional neural network (CNN) and an extreme learning machine (ELM) to identify two densely distributed ship formations from inhomogeneous clutter and single-ship targets. The experimental results based on real HFSWR background data demonstrate that the proposed cascade identification algorithm is superior to the extremum detector combined with the classical CNN algorithm for ship formation identification. Meanwhile, the proposed algorithm performs well in weak-formation and deformed-formation identification. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
35. Spatial-Spectral BERT for Hyperspectral Image Classification.
- Author
-
Ashraf, Mahmood, Zhou, Xichuan, Vivone, Gemine, Chen, Lihui, Chen, Rong, and Majdard, Reza Seifi
- Subjects
- *
IMAGE recognition (Computer vision) , *LANGUAGE models , *DEEP learning , *TRANSFORMER models , *CONVOLUTIONAL neural networks , *SPECTRAL imaging - Abstract
Several deep learning and transformer models have been recommended in previous research to deal with the classification of hyperspectral images (HSIs). Among them, one of the most innovative is the bidirectional encoder representation from transformers (BERT), which applies a distance-independent approach to capture the global dependency among all pixels in a selected region. However, this model does not consider the local spatial-spectral and spectral sequential relations. In this paper, a dual-dimensional (i.e., spatial and spectral) BERT (the so-called D2BERT) is proposed, which improves the existing BERT model by capturing more global and local dependencies between sequential spectral bands regardless of distance. In the proposed model, two BERT branches work in parallel to investigate relations among pixels and spectral bands, respectively. In addition, the layer intermediate information is used for supervision during the training phase to enhance the performance. We used two widely employed datasets for our experimental analysis. The proposed D2BERT shows superior classification accuracy and computational efficiency with respect to some state-of-the-art neural networks and the previously developed BERT model for this task. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
36. Hyperspectral Image Super-Resolution Based on Feature Diversity Extraction.
- Author
-
Zhang, Jing, Zheng, Renjie, Wan, Zekang, Geng, Ruijing, Wang, Yi, Yang, Yu, Zhang, Xuepeng, and Li, Yunsong
- Subjects
- *
DEEP learning , *FEATURE extraction , *IMAGE reconstruction algorithms , *HIGH resolution imaging , *CONVOLUTIONAL neural networks - Abstract
Deep learning is an important research topic in the field of image super-resolution. However, the performance of existing hyperspectral image super-resolution networks is limited by their feature learning, and current algorithms struggle to extract diverse features. In this paper, we address these feature learning limitations and introduce the Channel-Attention-Based Spatial–Spectral Feature Extraction network (CSSFENet) to enhance hyperspectral image feature diversity and optimize the network loss functions. Our contributions include: (a) a convolutional neural network super-resolution algorithm incorporating diverse feature extraction that enhances the network's diversity feature learning by elevating the matrix rank, (b) a three-dimensional (3D) feature extraction convolution module, the Channel-Attention-Based Spatial–Spectral Feature Extraction Module (CSSFEM), that boosts the network's performance in both the spatial and spectral domains, (c) a feature diversity loss function, designed from the singular values of the image matrix, that maximizes element independence, and (d) a spatial–spectral gradient loss function, based on spatial and spectral gradient values, that enhances the spatial–spectral smoothness of the reconstructed image. Evaluated against existing hyperspectral super-resolution algorithms with four indexes (PSNR, mPSNR, SSIM, and SAM), our method showed superior performance on three common hyperspectral datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
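PSNR, one of the evaluation indexes named in the abstract above, has a standard definition that can be sketched in a few lines (illustrative only; the peak value and image representation here are assumptions, not taken from the paper):

```python
import math

def psnr(ref, test, peak=1.0):
    """Peak signal-to-noise ratio between two equal-size images.

    `ref` and `test` are nested lists of pixel values; `peak` is the
    maximum possible pixel value (assumed 1.0 for normalized data).
    """
    flat_ref = [v for row in ref for v in row]
    flat_test = [v for row in test for v in row]
    mse = sum((a - b) ** 2 for a, b in zip(flat_ref, flat_test)) / len(flat_ref)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

The mPSNR variant mentioned in the abstract would average a per-band PSNR over the spectral bands.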
37. Convolutional Neural Network-Based Method for Agriculture Plot Segmentation in Remote Sensing Images.
- Author
-
Qi, Liang, Zuo, Danfeng, Wang, Yirong, Tao, Ye, Tang, Runkang, Shi, Jiayu, Gong, Jiajun, and Li, Bangyu
- Subjects
- *
IMAGE segmentation , *REMOTE sensing , *REMOTE-sensing images , *LAND use , *FEATURE extraction , *AGRICULTURAL productivity - Abstract
Accurate delineation of individual agricultural plots, the foundational units for agriculture-based activities, is crucial for effective government oversight of agricultural productivity and land utilization. To improve the accuracy of plot segmentation in high-resolution remote sensing images, this paper collects GF-2 satellite remote sensing images, uses ArcGIS 10.3.1 software to establish datasets, and builds UNet, SegNet, DeeplabV3+, and TransUNet neural network frameworks for experimental analysis. The TransUNet network, which yields the best segmentation results, is then optimized in both the residual module and the skip connection to further improve its performance for plot segmentation in high-resolution remote sensing images. This article introduces Deformable ConvNets in the residual module to improve the original ResNet50 feature extraction network, and incorporates the convolutional block attention module (CBAM) at the skip connection to refine the skip connection steps. Experimental results indicate that the optimized remote sensing plot segmentation algorithm based on the TransUNet network achieves an Accuracy of 86.02%, a Recall of 83.32%, an F1-score of 84.67%, and an Intersection over Union (IoU) of 86.90%. Compared to the original TransUNet network for remote sensing land parcel segmentation, whose F1-score is 81.94% and whose IoU is 69.41%, the optimized TransUNet network significantly improves the performance of remote sensing land parcel segmentation, verifying the effectiveness and reliability of the plot segmentation algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
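The F1-score and IoU metrics reported in the abstract above are both derived from the same detection counts; a minimal sketch of their standard definitions (not the authors' evaluation code):

```python
def f1_and_iou(tp, fp, fn):
    """F1-score and IoU from true positive, false positive, and false
    negative pixel counts of a binary segmentation."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    iou = tp / (tp + fp + fn)
    return f1, iou
```

For binary masks the two metrics are linked by IoU = F1 / (2 − F1), which is a useful sanity check when comparing reported numbers.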
38. Sensing and Deep CNN-Assisted Semi-Blind Detection for Multi-User Massive MIMO Communications.
- Author
-
Han, Fengxia, Zeng, Jin, Zheng, Le, Zhang, Hongming, and Wang, Jianhui
- Subjects
- *
CONVOLUTIONAL neural networks , *LOW-rank matrices , *MACHINE-to-machine communications , *SIGNAL denoising , *TELECOMMUNICATION systems , *MIMO radar , *COMPUTATIONAL complexity - Abstract
Attaining precise target detection and channel measurements is critical for guiding beamforming optimization and data demodulation in massive multiple-input multiple-output (MIMO) communication systems with hybrid structures, which otherwise require large pilot overhead and substantial computational complexity. Benefiting from the powerful detection characteristics of MIMO radar, we design a novel sensing-assisted semi-blind detection scheme in this paper, where both the inherent low-rankness of the signal matrix and essential knowledge about the geometric environment are fully exploited in a designated cooperative manner. Specifically, to efficiently recover the channel factorizations via the formulated low-rank matrix completion problem, a low-complexity iterative algorithm stemming from the alternating steepest descent (ASD) method is adopted to obtain solutions in the case of unknown noise statistics. Moreover, we take one step further by employing the denoising convolutional neural network (DnCNN) to preprocess the received signals, owing to its favorable performance in Gaussian denoising. The overall paradigm of our proposed scheme consists of three stages: (1) target parameter sensing, (2) communication signal denoising and (3) semi-blind detection refinement. Simulation results show that the proposed scheme achieves significant estimation gains with reduced training overhead in a variety of system settings. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
39. A Lightweight Arbitrarily Oriented Detector Based on Transformers and Deformable Features for Ship Detection in SAR Images.
- Author
-
Chen, Bingji, Xue, Fengli, and Song, Hongjun
- Subjects
- *
TRANSFORMER models , *CONVOLUTIONAL neural networks , *SYNTHETIC aperture radar , *FEATURE extraction , *SHIPS - Abstract
Lightweight ship detection is an important application of synthetic aperture radar (SAR). The prevailing trend in recent research involves employing a detection framework based on convolutional neural networks (CNNs) and horizontal bounding boxes (HBBs). However, CNNs with local receptive fields fall short in acquiring adequate contextual information and exhibit sensitivity to noise. Moreover, HBBs introduce significant interference from both the background and adjacent ships. To overcome these limitations, this paper proposes a lightweight transformer-based method for detecting arbitrarily oriented ships in SAR images, called LD-Det, which excels at promptly and accurately identifying rotating ship targets. First, light pyramid vision transformer (LightPVT) is introduced as a lightweight backbone network. Built upon PVT v2-B0-Li, it effectively captures the long-range dependencies of ships in SAR images. Subsequently, multi-scale deformable feature pyramid network (MDFPN) is constructed as a neck network, utilizing the multi-scale deformable convolution (MDC) module to adjust receptive field regions and extract ship features from SAR images more effectively. Lastly, shared deformable head (SDHead) is proposed as a head network, enhancing ship feature extraction with the combination of deformable convolution operations and a shared parameter structure design. Experimental evaluations on two publicly available datasets validate the efficacy of the proposed method. Notably, the proposed method achieves state-of-the-art detection performance when compared with other lightweight methods in detecting rotated targets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
40. Intelligent Environment-Adaptive GNSS/INS Integrated Positioning with Factor Graph Optimization.
- Author
-
Li, Zhengdao, Lee, Pin-Hsun, Hung, Tsz Hin Marcus, Zhang, Guohao, and Hsu, Li-Ta
- Subjects
- *
GLOBAL Positioning System , *CONVOLUTIONAL neural networks , *DEEP learning , *INTELLIGENT transportation systems , *STANDARD deviations , *INERTIAL navigation systems - Abstract
Global navigation satellite systems (GNSSs) applied to intelligent transport systems in urban areas suffer from multipath and non-line-of-sight (NLOS) effects due to signal reflections from high-rise buildings, which seriously degrade the accuracy and reliability of vehicle positioning in real-time applications. Accordingly, the integration of GNSS and inertial navigation systems (INSs) can be used to improve positioning performance. However, the fixed GNSS solution uncertainty of the conventional integration method cannot capture the fluctuating GNSS reliability in fast-changing urban environments. This weakness can be addressed with a deep learning model that senses the ambient environment intelligently, and further mitigated using factor graph optimization (FGO), which is capable of generating robust solutions based on historical data. This paper develops an adaptive GNSS/INS loosely coupled system based on FGO, with the fixed-gain Kalman filter (KF) and adaptive KF (AKF) taken as comparisons. The adaptation is aided by a convolutional neural network (CNN), and feasibility is verified using data from different grades of receivers. Compared with the integration using fixed-gain KF, the proposed adaptive FGO (AFGO) maintains 100% positioning availability and reduces the overall 2D positioning error by up to 70% in terms of both root mean square error (RMSE) and standard deviation (STD). [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
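The core idea in the abstract above, scaling the trust placed in GNSS measurements according to an environment score, can be illustrated with a scalar Kalman update. This is a toy sketch under stated assumptions (the inflation rule and the `nlos_score` input standing in for the CNN output are hypothetical, not the paper's model):

```python
def kf_update(x, p, z, r):
    """One scalar Kalman measurement update.

    x, p: prior state estimate and its variance; z, r: measurement and
    its noise variance. Returns the posterior estimate and variance.
    """
    k = p / (p + r)                 # Kalman gain
    return x + k * (z - x), (1 - k) * p

def adaptive_update(x, p, z, r_open_sky, nlos_score):
    """Inflate measurement noise when the environment score (0 = open
    sky, 1 = severe NLOS) indicates unreliable GNSS. The factor 10 is
    an illustrative choice."""
    r = r_open_sky * (1.0 + 10.0 * nlos_score)
    return kf_update(x, p, z, r)
```

In open sky the measurement is weighted strongly; under NLOS the same measurement barely moves the estimate, which mimics the adaptive weighting that the CNN-aided FGO performs over a whole trajectory.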
41. Self-Adaptive-Filling Deep Convolutional Neural Network Classification Method for Mountain Vegetation Type Based on High Spatial Resolution Aerial Images.
- Author
-
Li, Shiou, Fei, Xianyun, Chen, Peilong, Wang, Zhen, Gao, Yajun, Cheng, Kai, Wang, Huilong, and Zhang, Yuanzhi
- Subjects
- *
CONVOLUTIONAL neural networks , *MOUNTAIN plants , *VEGETATION classification , *SPATIAL resolution , *REMOTE sensing - Abstract
The composition and structure of mountain vegetation are complex and changeable, urgently requiring the integration of Object-Based Image Analysis (OBIA) and Deep Convolutional Neural Networks (DCNNs). However, although integration technology studies continue to increase, few studies have classified mountain vegetation by combining OBIA and DCNNs, because it is difficult to obtain enough samples to realize the potential of DCNNs for mountain vegetation type classification, especially with high-spatial-resolution remote sensing images. To address this issue, we propose a self-adaptive-filling method (SAF) that incorporates OBIA to improve the performance of DCNNs in mountain vegetation type classification using high-spatial-resolution aerial images. SAF produces enough regular sample data for DCNNs by filling the irregular objects created by image segmentation with interior adaptive pixel blocks; meanwhile, non-sample segmented image objects are shaped into different regular rectangular blocks. The final classification result is then obtained by voting over the DCNN outputs. Compared to traditional OBIA methods, SAF generates more samples for the DCNN and fully utilizes every single pixel of the DCNN input. We design experiments to compare SAF with traditional OBIA and semantic segmentation methods such as U-net, MACU-net, and SegNeXt. The results show that our SAF-DCNN outperforms traditional OBIA in terms of accuracy and is similar in accuracy to the best-performing semantic segmentation method, while reducing the common pretzel phenomenon of semantic segmentation (black-and-white noise generated in classification). Overall, the SAF-based OBIA using DCNNs proposed in this paper is superior to other commonly used methods for vegetation classification in mountainous areas. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
42. Deep Learning for Earthquake Disaster Assessment: Objects, Data, Models, Stages, Challenges, and Opportunities.
- Author
-
Jia, Jing and Ye, Wenjie
- Subjects
- *
DEEP learning , *CONVOLUTIONAL neural networks , *EARTHQUAKES , *GENERATIVE adversarial networks , *RECURRENT neural networks , *IMAGE recognition (Computer vision) - Abstract
Earthquake Disaster Assessment (EDA) plays a critical role in earthquake disaster prevention, evacuation, and rescue efforts. Deep learning (DL), which boasts advantages in image processing, signal recognition, and object detection, has facilitated scientific research in EDA. This paper analyses 204 articles through a systematic literature review to investigate the status quo, development, and challenges of DL for EDA. The paper first examines the distribution characteristics and trends of the two categories of EDA assessment objects, including earthquakes and secondary disasters as disaster objects, buildings, infrastructure, and areas as physical objects. Next, this study analyses the application distribution, advantages, and disadvantages of the three types of data (remote sensing data, seismic data, and social media data) mainly involved in these studies. Furthermore, the review identifies the characteristics and application of six commonly used DL models in EDA, including convolutional neural network (CNN), multi-layer perceptron (MLP), recurrent neural network (RNN), generative adversarial network (GAN), transfer learning (TL), and hybrid models. The paper also systematically details the application of DL for EDA at different times (i.e., pre-earthquake stage, during-earthquake stage, post-earthquake stage, and multi-stage). We find that the most extensive research in this field involves using CNNs for image classification to detect and assess building damage resulting from earthquakes. Finally, the paper discusses challenges related to training data and DL models, and identifies opportunities in new data sources, multimodal DL, and new concepts. This review provides valuable references for scholars and practitioners in related fields. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
43. Multi-Featured Sea Ice Classification with SAR Image Based on Convolutional Neural Network.
- Author
-
Wan, Hongyang, Luo, Xiaowen, Wu, Ziyin, Qin, Xiaoming, Chen, Xiaolun, Li, Bin, Shang, Jihong, and Zhao, Dineng
- Subjects
- *
CONVOLUTIONAL neural networks , *SEA ice , *IMAGE recognition (Computer vision) , *SYNTHETIC aperture radar , *TIME-frequency analysis - Abstract
Sea ice is a significant factor influencing environmental change on Earth. Monitoring sea ice is of major importance, and one of the main objectives of this monitoring is sea ice classification. Currently, synthetic aperture radar (SAR) data are primarily used for sea ice classification, with a single polarization band or simple combinations of polarization bands being common choices. While much of the current research has focused on optimizing network structures to achieve high classification accuracy, which requires substantial training resources, we aim to extract more information from the SAR data itself. We therefore propose a multi-featured SAR sea ice classification method that combines polarization features calculated by polarization decomposition and spectrogram features calculated by joint time-frequency analysis (JTFA). We built a convolutional neural network (CNN) structure for learning the multi-features of sea ice, combining spatial features and physical properties, including the polarization and spectrogram features of sea ice. In this paper, we utilized ALOS PALSAR SLC data with four polarization types (HH, HV, VH, and VV) for the multi-featured sea ice classification method. We divided the sea ice into new ice (NI), first-year ice (FI), old ice (OI), deformed ice (DI), and open water (OW). Accuracy was then computed via the confusion matrix and a comparative analysis was carried out. Our experimental results demonstrate that the proposed multi-feature method can achieve high accuracy with a smaller data volume and less computational effort. In the four scenes selected for validation, the overall accuracy reached 95%, 91%, 96%, and 95%, respectively, a significant improvement over the single-feature sea ice classification method. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
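The overall-accuracy figures quoted in the abstract above come straight from the confusion matrix; a minimal sketch of that computation (illustrative, with a made-up two-class matrix rather than the paper's five ice classes):

```python
def overall_accuracy(cm):
    """Overall accuracy of a square confusion matrix: the sum of the
    diagonal (correctly classified samples) over the total count."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total
```

For the paper's setting, `cm` would be a 5×5 matrix over the NI, FI, OI, DI, and OW classes; per-class producer and user accuracies follow from the row and column sums of the same matrix.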
44. On-Board Multi-Class Geospatial Object Detection Based on Convolutional Neural Network for High Resolution Remote Sensing Images.
- Author
-
Shen, Yanyun, Liu, Di, Chen, Junyi, Wang, Zhipan, Wang, Zhe, and Zhang, Qingling
- Subjects
- *
OBJECT recognition (Computer vision) , *CONVOLUTIONAL neural networks , *REMOTE-sensing images , *REMOTE sensing , *DATA transmission systems , *URBAN planning , *OPTICAL remote sensing - Abstract
Multi-class geospatial object detection in high-resolution remote sensing images has significant potential in various domains such as industrial production, military warning, disaster monitoring, and urban planning. However, the traditional process of remote sensing object detection involves several time-consuming steps, including image acquisition, image download, ground processing, and object detection. These steps may not be suitable for tasks with shorter timeliness requirements, such as military warning and disaster monitoring. Additionally, the transmission of massive data from satellites to the ground is limited by bandwidth, resulting in time delays and redundant information, such as cloud coverage images. To address these challenges and achieve efficient utilization of information, this paper proposes a comprehensive on-board multi-class geospatial object detection scheme. The proposed scheme consists of several steps. Firstly, the satellite imagery is sliced, and the PID-Net (Proportional-Integral-Derivative Network) method is employed to detect and filter out cloud-covered tiles. Subsequently, our Manhattan Intersection over Union (MIOU) loss-based YOLO (You Only Look Once) v7-Tiny method is used to detect remote-sensing objects in the remaining tiles. Finally, the detection results are mapped back to the original image, and the truncated NMS (Non-Maximum Suppression) method is utilized to filter out repeated and noisy boxes. To validate the reliability of the scheme, this paper creates a new dataset called DOTA-CD (Dataset for Object Detection in Aerial Images-Cloud Detection). Experiments were conducted on both ground and on-board equipment using the AIR-CD dataset, DOTA dataset, and DOTA-CD dataset. The results demonstrate the effectiveness of our method. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
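The final step of the on-board scheme above filters repeated boxes with a truncated NMS. The paper's truncated variant is not specified here, so the sketch below shows only the standard greedy NMS it builds on (illustrative, not the authors' code):

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, thresh=0.5):
    """Greedy NMS: visit boxes by descending score, keep a box only if
    its IoU with every already-kept box is at most `thresh`."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= thresh for j in keep):
            keep.append(i)
    return keep
```

After tile-level detection, boxes mapped back to the full image would be de-duplicated this way across overlapping tile borders.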
45. Hyperspectral Image Classification via Spatial Shuffle-Based Convolutional Neural Network.
- Author
-
Wang, Zhihui, Cao, Baisong, and Liu, Jun
- Subjects
- *
CONVOLUTIONAL neural networks , *IMAGE recognition (Computer vision) , *SPECTRAL imaging - Abstract
The unique spatial–spectral integration characteristics of hyperspectral imagery (HSI) make it widely applicable in many fields. Spatial–spectral feature fusion-based HSI classification has long been a research hotspot. Typically, classification methods based on spatial–spectral features select larger neighborhood windows to extract more spatial features for classification. However, this approach can also lead, to a certain extent, to the problem of non-independent training and testing sets. This paper proposes a spatial shuffle strategy that selects a smaller neighborhood window and randomly shuffles the pixels within the window, simulating the potential patterns of real-world pixel distributions as much as possible. The samples of a three-dimensional HSI cube are then transformed into two-dimensional images. Training with a simple CNN model whose architecture is not optimized can still achieve very high classification accuracy, indicating that the proposed method has considerable performance-improvement potential. The experimental results also indicate that smaller neighborhood windows can achieve the same, or even better, classification performance compared to larger neighborhood windows. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
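The spatial shuffle strategy in the abstract above permutes the pixels inside a small neighborhood window while leaving each pixel's spectral vector intact. A minimal sketch of that operation (illustrative only; the window representation and seeding are assumptions):

```python
import random

def spatial_shuffle(window, seed=0):
    """Randomly permute the pixel positions within an HSI neighborhood
    window, keeping every pixel's spectral vector intact.

    `window` is a list of rows; each pixel is a tuple of band values.
    """
    pixels = [px for row in window for px in row]
    random.Random(seed).shuffle(pixels)   # shuffle positions only
    cols = len(window[0])
    return [pixels[i * cols:(i + 1) * cols] for i in range(len(window))]
```

Since only positions change, spectral information is fully preserved while the fixed local spatial arrangement, which can leak between training and testing sets, is destroyed.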
46. Despeckling of SAR Images Using Residual Twin CNN and Multi-Resolution Attention Mechanism.
- Author
-
Pongrac, Blaž and Gleich, Dušan
- Subjects
- *
CONVOLUTIONAL neural networks , *SYNTHETIC aperture radar , *SPECKLE interference - Abstract
The despeckling of synthetic aperture radar (SAR) images using two different convolutional neural network architectures is presented in this paper. The first method is a novel Siamese convolutional neural network with a dilated convolutional network in each branch. Recently, attention mechanisms have been introduced into convolutional networks to better model and recognize features; we therefore propose a novel convolutional neural network design using an attention mechanism in an encoder–decoder-type network. The framework consists of a multiscale spatial attention network to improve the modeling of semantic information at different spatial levels and an additional attention mechanism to optimize feature propagation. The two proposed methods differ in design but provide comparable despeckling results, in both subjective and objective measurements, on correlated speckle noise. The experimental results are evaluated on both synthetically generated speckled images and real SAR images. The methods proposed in this paper are able to despeckle SAR images while preserving SAR features. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
47. A CatBoost-Based Model for the Intensity Detection of Tropical Cyclones over the Western North Pacific Based on Satellite Cloud Images.
- Author
-
Zhong, Wei, Zhang, Deyuan, Sun, Yuan, and Wang, Qian
- Subjects
- *
TROPICAL cyclones , *REMOTE-sensing images , *CONVOLUTIONAL neural networks , *STANDARD deviations , *BRIGHTNESS temperature - Abstract
A CatBoost-based intelligent tropical cyclone (TC) intensity-detecting model was built to quantify the intensity of TCs over the Western North Pacific (WNP) using the cloud-top brightness temperature (CTBT) data of Fengyun-2F (FY-2F) and Fengyun-2G (FY-2G) and the best-track data of the China Meteorological Administration (CMA-BST) in recent years (2015–2018). The CatBoost-based model features a greedy combination strategy, an ordering principle that mitigates possible gradient bias and prediction shift, and oblivious trees for fast scoring. Compared with previous studies based on pure convolutional neural network (CNN) models, the CatBoost-based model exhibited better skill in detecting TC intensity, with a root mean square error (RMSE) of 3.74 m/s. Beyond the three model features mentioned above, two further design choices contributed to this result. On one hand, the CatBoost-based model introduces prior physical factors (e.g., the structure and shape of the cloud, deep convection, and background fields) into its training process. On the other hand, it expands the dataset from 2342 to 13,471 samples through hourly interpolation of the original dataset. Furthermore, this paper investigated the errors of the model in detecting different categories of TC intensity. The results showed that the proposed deep learning-based TC intensity-detecting model has systematic biases, namely the overestimation (underestimation) of intensities for TCs weaker (stronger) than typhoon level, with smaller (larger) errors for weaker (stronger) TCs. This implies that factors beyond the CTBT should be included to further reduce the errors in detecting strong TCs. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
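The dataset expansion described above (2342 to 13,471 samples) rests on hourly interpolation of the original, typically 6-hourly, best-track records. A minimal linear-interpolation sketch of that step (illustrative; the paper does not specify its interpolation scheme):

```python
def hourly_interpolate(times_h, values, step_h=1):
    """Linearly interpolate best-track values (e.g., 6-hourly intensity
    in m/s) onto an hourly grid between the first and last fix."""
    out = []
    t = times_h[0]
    i = 0
    while t <= times_h[-1]:
        while t > times_h[i + 1]:   # advance to the bracketing segment
            i += 1
        t0, t1 = times_h[i], times_h[i + 1]
        w = (t - t0) / (t1 - t0)
        out.append((t, values[i] * (1 - w) + values[i + 1] * w))
        t += step_h
    return out
```

One 6-hour segment thus yields seven hourly samples instead of two, which matches the roughly sixfold dataset growth reported in the abstract.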
48. HyperSFormer: A Transformer-Based End-to-End Hyperspectral Image Classification Method for Crop Classification.
- Author
-
Xie, Jiaxing, Hua, Jiajun, Chen, Shaonan, Wu, Peiwen, Gao, Peng, Sun, Daozong, Lyu, Zhendong, Lyu, Shilei, Xue, Xiuyun, and Lu, Jianqiang
- Subjects
- *
IMAGE recognition (Computer vision) , *TRANSFORMER models , *RECURRENT neural networks , *CONVOLUTIONAL neural networks , *CROP yields - Abstract
Crop classification of large-scale agricultural land is crucial for crop monitoring and yield estimation. Hyperspectral image classification has proven to be an effective method for this task. Most current popular hyperspectral image classification methods are based on image classification, specifically on convolutional neural networks (CNNs) and recurrent neural networks (RNNs). In contrast, this paper focuses on methods based on semantic segmentation and proposes a new transformer-based approach called HyperSFormer for crop hyperspectral image classification. The key enhancement of the proposed method is the replacement of the encoder in SegFormer with an improved Swin Transformer while keeping the SegFormer decoder. The entire model adopts a simple and uniform transformer architecture. Additionally, the paper introduces the hyper patch embedding (HPE) module to extract spectral and local spatial information from the hyperspectral images, which enhances the effectiveness of the features used as input for the model. To ensure detailed model processing and achieve end-to-end hyperspectral image classification, the transpose padding upsample (TPU) module is proposed for the model's output. In order to address the problem of insufficient and imbalanced samples in hyperspectral image classification, the paper designs an adaptive min log sampling (AMLS) strategy and a loss function that incorporates dice loss and focal loss to assist model training. Experimental results using three public hyperspectral image datasets demonstrate the strong performance of HyperSFormer, particularly in the presence of imbalanced sample data, complex negative samples, and mixed sample classes. HyperSFormer outperforms state-of-the-art methods, including fast patch-free global learning (FPGA), a spectral–spatial-dependent global learning framework (SSDGL), and SegFormer, by at least 2.7% in the mean intersection over union (mIoU). It also improves the overall accuracy and average accuracy values by at least 0.9% and 0.3%, respectively, and the kappa coefficient by at least 0.011. Furthermore, ablation experiments were conducted to determine the optimal hyperparameter and loss function settings for the proposed method, validating the rationality of these settings and the fusion loss function. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
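The fusion loss in the abstract above combines dice loss and focal loss. A minimal sketch of the two terms and a weighted combination (illustrative; the weights and the 1D binary-mask representation are assumptions, not the paper's settings):

```python
import math

def dice_loss(probs, targets, eps=1e-6):
    """Soft dice loss for a binary mask: 1 - 2|P∩T| / (|P| + |T|)."""
    inter = sum(p * t for p, t in zip(probs, targets))
    total = sum(probs) + sum(targets)
    return 1.0 - (2.0 * inter + eps) / (total + eps)

def focal_loss(probs, targets, gamma=2.0):
    """Binary focal loss averaged over pixels; the (1 - pt)^gamma factor
    down-weights easy, well-classified pixels."""
    losses = []
    for p, t in zip(probs, targets):
        pt = p if t == 1 else 1.0 - p
        losses.append(-((1.0 - pt) ** gamma) * math.log(max(pt, 1e-12)))
    return sum(losses) / len(losses)

def fused_loss(probs, targets, w_dice=0.5, w_focal=0.5):
    """Weighted sum of the two terms (equal weights are illustrative)."""
    return w_dice * dice_loss(probs, targets) + w_focal * focal_loss(probs, targets)
```

Dice handles the class-imbalance problem the abstract mentions at the region level, while focal loss re-weights individual hard pixels; combining them addresses both.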
49. Estimation of the Two-Dimensional Direction of Arrival for Low-Elevation and Non-Low-Elevation Targets Based on Dilated Convolutional Networks.
- Author
-
Hu, Guoping, Zhao, Fangzheng, and Liu, Bingqi
- Subjects
- *
CONVOLUTIONAL neural networks , *DIRECTION of arrival estimation , *MIMO radar , *HUMAN fingerprints , *COVARIANCE matrices , *SIGNAL-to-noise ratio - Abstract
This paper addresses the two-dimensional direction-of-arrival (2D DOA) estimation of low-elevation and non-low-elevation targets using L-shaped uniform and sparse arrays by analyzing the features of the signal models and their mapping to 2D DOA. It proposes a 2D DOA estimation algorithm based on a dilated convolutional network model consisting of two components: a dilated convolutional autoencoder and a dilated convolutional neural network. If low-elevation targets are present, the dilated convolutional autoencoder suppresses the multipath signal and outputs a new signal covariance matrix as the input of the dilated convolutional neural network; in the absence of low-elevation targets, the dilated convolutional neural network performs 2D DOA estimation directly. The algorithm employs 3D convolution to fully retain and extract features. The simulation experiments and the analysis of their results revealed that, for both L-shaped uniform and L-shaped sparse arrays, the dilated convolutional autoencoder could effectively suppress the multipath signals without affecting the direct wave and non-low-elevation targets, whereas the dilated convolutional neural network could effectively achieve 2D DOA estimation with a matching rate and an effective ratio of pitch and azimuth angles close to 100%, without the need for additional parameter matching. Under low signal-to-noise ratio conditions, the estimation accuracy of the proposed algorithm was significantly higher than that of traditional DOA estimation. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
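Dilated convolutions, central to the model above, enlarge the receptive field without adding parameters. The standard receptive-field arithmetic can be sketched as follows (a general formula, not specific to this paper's architecture):

```python
def dilated_receptive_field(kernel=3, dilation=1):
    """Effective spatial extent of one dilated convolution: a kernel of
    size k with dilation d spans k + (k - 1) * (d - 1) input positions."""
    return kernel + (kernel - 1) * (dilation - 1)

def stacked_receptive_field(layers):
    """Receptive field of stacked stride-1 conv layers, each given as a
    (kernel, dilation) pair: every layer adds (effective_kernel - 1)."""
    rf = 1
    for k, d in layers:
        rf += dilated_receptive_field(k, d) - 1
    return rf
```

For example, three stacked 3×3 layers with dilations 1, 2, and 4 cover a 15-wide receptive field with the parameter cost of three plain 3×3 convolutions, which is why dilation suits covariance-matrix inputs where long-range structure matters.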
50. DAFCNN: A Dual-Channel Feature Extraction and Attention Feature Fusion Convolution Neural Network for SAR Image and MS Image Fusion.
- Author
-
Luo, Jiahao, Zhou, Fang, Yang, Jun, and Xing, Mengdao
- Subjects
- *
IMAGE fusion , *DEEP learning , *CONVOLUTIONAL neural networks , *FEATURE extraction , *MACHINE learning , *SYNTHETIC aperture radar , *SPATIAL ability - Abstract
In the field of image fusion, spatial detail blurring and color distortion appear in synthetic aperture radar (SAR) images and multispectral (MS) images during the traditional fusion process due to the difference in sensor imaging mechanisms. To solve this problem, this paper proposes a fusion method for SAR and MS images based on a convolutional neural network. To make use of the spatial information and multi-scale feature information of high-resolution SAR images, a dual-channel feature extraction module is constructed to obtain a SAR image feature map. In addition, unlike the common direct-addition strategy, an attention-based feature fusion module is designed to achieve spectral fidelity of the fused images. To give the network better spectral and spatial retention ability, an unsupervised joint loss function is designed to train the network. In this paper, Sentinel-1 SAR images and Landsat 8 MS images are used as datasets for the experiments. The experimental results show that the proposed algorithm outperforms traditional fusion methods and deep learning algorithms in both quantitative and visual evaluation. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF