2,184 results
Search Results
2. On-Board Multi-Class Geospatial Object Detection Based on Convolutional Neural Network for High Resolution Remote Sensing Images.
- Author
- Shen, Yanyun, Liu, Di, Chen, Junyi, Wang, Zhipan, Wang, Zhe, and Zhang, Qingling
- Subjects
- OBJECT recognition (Computer vision), CONVOLUTIONAL neural networks, REMOTE-sensing images, REMOTE sensing, DATA transmission systems, URBAN planning, OPTICAL remote sensing
- Abstract
Multi-class geospatial object detection in high-resolution remote sensing images has significant potential in various domains such as industrial production, military warning, disaster monitoring, and urban planning. However, the traditional process of remote sensing object detection involves several time-consuming steps, including image acquisition, image download, ground processing, and object detection. These steps may not be suitable for time-critical tasks such as military warning and disaster monitoring. Additionally, the transmission of massive data from satellites to the ground is limited by bandwidth, resulting in time delays and redundant information, such as cloud-covered images. To address these challenges and achieve efficient utilization of information, this paper proposes a comprehensive on-board multi-class geospatial object detection scheme. The proposed scheme consists of several steps. Firstly, the satellite imagery is sliced, and the PID-Net (Proportional-Integral-Derivative Network) method is employed to detect and filter out cloud-covered tiles. Subsequently, our Manhattan Intersection over Union (MIOU) loss-based YOLO (You Only Look Once) v7-Tiny method is used to detect remote-sensing objects in the remaining tiles. Finally, the detection results are mapped back to the original image, and the truncated NMS (Non-Maximum Suppression) method is utilized to filter out repeated and noisy boxes. To validate the reliability of the scheme, this paper creates a new dataset called DOTA-CD (Dataset for Object Detection in Aerial Images-Cloud Detection). Experiments were conducted on both ground and on-board equipment using the AIR-CD dataset, DOTA dataset, and DOTA-CD dataset. The results demonstrate the effectiveness of our method. [ABSTRACT FROM AUTHOR]
- Published
- 2023
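The slice-detect-map-suppress pipeline in this abstract ends with a box-filtering step over detections mapped back to the original image. As an illustration of the general idea only, here is standard greedy NMS on boxes in full-image coordinates; the paper's truncated-NMS variant and MIOU loss are not reproduced, and all names below are my own:

```python
import numpy as np

def iou(box, boxes):
    """Standard IoU between one box and an array of boxes, format [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, thresh=0.5):
    """Greedy NMS over detections merged from all tiles into full-image coordinates."""
    order = np.argsort(scores)[::-1]   # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        order = rest[iou(boxes[i], boxes[rest]) < thresh]
    return keep

# Two overlapping tiles detect the same object twice; NMS keeps one box per object.
boxes = np.array([[10, 10, 50, 50], [12, 11, 52, 49], [100, 100, 140, 140]], float)
scores = np.array([0.9, 0.8, 0.7])
kept = nms(boxes, scores)
```

With the toy boxes above, the two near-duplicate detections of the first object collapse to the higher-scoring one, while the distinct third box survives.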
3. A Method for Underwater Acoustic Target Recognition Based on the Delay-Doppler Joint Feature.
- Author
- Du, Libin, Wang, Zhengkai, Lv, Zhichao, Han, Dongyue, Wang, Lei, Yu, Fei, and Lan, Qing
- Subjects
- CONVOLUTIONAL neural networks, ARCHITECTURAL acoustics, OBJECT recognition (Computer vision), FOURIER transforms
- Abstract
To overcome the limitation of identifying complex underwater acoustic targets from a single Time–Frequency (TF) signal feature, this paper designs a method that recognizes underwater targets based on the Delay-Doppler joint feature. First, this method uses the symplectic finite Fourier transform (SFFT) to extract the Delay-Doppler (DD) features of underwater acoustic signals, analyzes the Time–Frequency features at the same time, and combines the DD feature and the Time–Frequency feature to form a joint feature (TF-DD). This paper uses three types of convolutional neural networks to verify that TF-DD can effectively improve the accuracy of target recognition. Second, this paper designs an object recognition model (TF-DD-CNN) that takes the joint feature as input, which simplifies the neural network's overall structure and improves the model's training efficiency. This research employs ship-radiated noise to validate the efficacy of TF-DD-CNN for target identification. The results demonstrate that the combined feature and the TF-DD-CNN model introduced in this study can proficiently detect ships, and the model notably enhances the precision of detection. [ABSTRACT FROM AUTHOR]
- Published
- 2024
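The SFFT step that maps a Time–Frequency grid to the Delay-Doppler domain can be sketched as a 2-D transform: an FFT along the time-slot axis (resolving Doppler) and an IFFT along the subcarrier axis (resolving delay). This is one common OTFS-style convention and may differ in sign and normalization from the paper's definition:

```python
import numpy as np

def sfft(tf_grid):
    """Symplectic finite Fourier transform: Time-Frequency grid -> Delay-Doppler grid.
    FFT along axis 0 (time slots -> Doppler), IFFT along axis 1 (subcarriers -> delay);
    the scale factor makes the transform unitary."""
    n, m = tf_grid.shape
    return np.fft.ifft(np.fft.fft(tf_grid, axis=0), axis=1) * np.sqrt(m / n)

# A phase that advances linearly across time slots (a pure Doppler shift of 3 bins)
# concentrates at a single Delay-Doppler cell.
N, M = 16, 16
tf = np.exp(2j * np.pi * 3 * np.arange(N)[:, None] / N) * np.ones((1, M))
dd = sfft(tf)
```

For this toy grid, all the energy lands in the single cell at Doppler bin 3, delay bin 0, which is the localization property that makes the DD feature useful for time-varying channels.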
4. SFA-Net: Semantic Feature Adjustment Network for Remote Sensing Image Segmentation.
- Author
- Hwang, Gyutae, Jeong, Jiwoo, and Lee, Sang Jun
- Subjects
- CONVOLUTIONAL neural networks, COMPUTER vision, REMOTE sensing, DEEP learning, TRANSFORMER models
- Abstract
Advances in deep learning and computer vision techniques have made impacts in the field of remote sensing, enabling efficient data analysis for applications such as land cover classification and change detection. Convolutional neural networks (CNNs) and transformer architectures have been utilized in visual perception algorithms due to their effectiveness in analyzing local features and global context. In this paper, we propose a hybrid transformer architecture that consists of a CNN-based encoder and transformer-based decoder. We propose a feature adjustment module that refines the multiscale feature maps extracted from an EfficientNet backbone network. The adjusted feature maps are integrated into the transformer-based decoder to perform the semantic segmentation of the remote sensing images. This paper refers to the proposed encoder–decoder architecture as a semantic feature adjustment network (SFA-Net). To demonstrate the effectiveness of the SFA-Net, experiments were thoroughly conducted with four public benchmark datasets, including the UAVid, ISPRS Potsdam, ISPRS Vaihingen, and LoveDA datasets. The proposed model achieved state-of-the-art accuracy on the UAVid, ISPRS Vaihingen, and LoveDA datasets for the segmentation of the remote sensing images. On the ISPRS Potsdam dataset, our method achieved comparable accuracy to the latest model while reducing the number of trainable parameters from 113.8 M to 10.7 M. [ABSTRACT FROM AUTHOR]
- Published
- 2024
5. Editorial on Special Issue "3D Reconstruction and Mobile Mapping in Urban Environments Using Remote Sensing".
- Author
- Jiang, San, Weng, Duojie, Liu, Jianchen, and Jiang, Wanshou
- Subjects
- CONVOLUTIONAL neural networks, SPHERICAL projection, GEOGRAPHIC information systems, STANDARD deviations, GROUND penetrating radar, DIGITAL photogrammetry, SYNTHETIC aperture radar, RAILROAD tunnels, ROAD markings
- Abstract
This document is an editorial on the special issue of "3D Reconstruction and Mobile Mapping in Urban Environments Using Remote Sensing." The editorial highlights the importance of 3D reconstruction and mobile mapping in various applications such as autonomous driving, smart logistics, pedestrian navigation, and virtual reality. It discusses the emergence of remote sensing-based techniques and cutting-edge technologies like SfM, SLAM, and deep learning that have enhanced the field. The special issue includes 15 high-quality papers covering topics such as image feature matching, LiDAR/image-fused SLAM, NeRF-based scene rendering, and other applications like InSAR point cloud registration and 3D GPR for underground imaging. The editorial concludes by expressing gratitude to the authors and reviewers for their contributions and highlighting the value of this special issue for further research. [Extracted from the article]
- Published
- 2024
6. An Auditory Convolutional Neural Network for Underwater Acoustic Target Timbre Feature Extraction and Recognition.
- Author
- Ni, Junshuai, Ji, Fang, Lu, Shaoqing, and Feng, Weijia
- Subjects
- CONVOLUTIONAL neural networks, BASILAR membrane, FILTER banks, AUDITORY perception, AUDITORY selective attention
- Abstract
In order to extract the line-spectrum features of underwater acoustic targets in complex environments, an auditory convolutional neural network (ACNN) with the ability of frequency-component perception, timbre perception, and critical-information perception is proposed in this paper, inspired by the human auditory perception mechanism. This model first uses a gammatone filter bank that mimics the cochlear basilar membrane excitation response to decompose the input time-domain signal into a number of sub-bands, which guides the network to perceive the line-spectrum frequency information of the underwater acoustic target. A sequence of convolution layers is then used to filter out interfering noise and enhance the line-spectrum components of each sub-band by simulating the process of calculating the energy distribution features. An improved channel attention module is then connected to select the line spectra that are more critical for recognition; in this module, a new global pooling method is proposed and applied to better extract the intrinsic properties. Finally, the sub-band information is fused using a combination layer and a single-channel convolution layer to generate a vector with the same dimensions as the input signal at the output layer. A decision module with a Softmax classifier is added behind the auditory neural network and used to recognize the five classes of vessel targets in the ShipsEar dataset, achieving a recognition accuracy of 99.8%, an improvement of 2.7% over the previously proposed DRACNN method, with gains of varying degrees over the eight other compared methods. The visualization results show that the model can significantly suppress the interfering noise intensity and selectively enhance the radiated-noise line-spectrum energy of underwater acoustic targets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
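The gammatone filter-bank front-end described in this abstract has a standard closed form: the impulse response t^(n-1) e^(-2*pi*b*t) cos(2*pi*fc*t) with bandwidth b tied to the equivalent rectangular bandwidth (ERB). A sketch under the usual Glasberg-Moore ERB parameterization; the function names, durations, and tone test are illustrative, not from the paper:

```python
import numpy as np

def erb(fc):
    """Equivalent rectangular bandwidth (Glasberg & Moore) in Hz."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def gammatone_ir(fc, fs, dur=0.05, order=4):
    """4th-order gammatone impulse response: t^(n-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t),
    normalized to unit peak amplitude."""
    t = np.arange(int(dur * fs)) / fs
    g = t ** (order - 1) * np.exp(-2 * np.pi * 1.019 * erb(fc) * t) * np.cos(2 * np.pi * fc * t)
    return g / np.max(np.abs(g))

def filterbank(signal, fs, centers):
    """Decompose a signal into sub-bands, one per center frequency."""
    return np.stack([np.convolve(signal, gammatone_ir(fc, fs), mode="same")
                     for fc in centers])

fs = 8000
t = np.arange(fs) / fs
sig = np.sin(2 * np.pi * 500 * t) + 0.3 * np.sin(2 * np.pi * 2000 * t)
bands = filterbank(sig, fs, centers=[500, 1000, 2000])
```

Each row of `bands` is one cochlear-like sub-band: the 500 Hz component passes the 500 Hz channel nearly unattenuated while being strongly suppressed in the 1000 Hz channel, which is what lets downstream layers pick out line-spectrum components per band.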
7. Remote Sensing for Maritime Monitoring and Vessel Identification.
- Author
- Salerno, Emanuele, Di Paola, Claudio, and Lo Duca, Angelica
- Subjects
- DEEP learning, REMOTE sensing, CONVOLUTIONAL neural networks, SURVEILLANCE radar, SYNTHETIC aperture radar, INFORMATION technology, PATTERN recognition systems
- Abstract
This document explores the significance of remote sensing in monitoring maritime activities and identifying vessels. It emphasizes the need for surveillance to ensure safety, security, and emergency management, given the increasing number of vessels worldwide. The document highlights the use of technologies like the Automatic Identification System (AIS) and remote sensing in situations where collaborative systems are not reliable. It also discusses the integration of data from different sensors and the application of data science techniques for a comprehensive assessment of maritime traffic. The document concludes by summarizing research papers on ship detection, tracking, and classification using various sensors and data processing techniques. [Extracted from the article]
- Published
- 2024
8. Vulnerable Road User Skeletal Pose Estimation Using mmWave Radars.
- Author
- Zeng, Zhiyuan, Liang, Xingdong, Li, Yanlei, and Dang, Xiangwei
- Subjects
- ROAD users, TRACKING radar, RADAR targets, CONVOLUTIONAL neural networks, RADAR signal processing, DATA augmentation
- Abstract
A skeletal pose estimation method, named RVRU-Pose, is proposed to estimate the skeletal pose of vulnerable road users based on distributed non-coherent mmWave radar. In view of the limitation that existing methods for skeletal pose estimation are only applicable to small scenes, this paper proposes a strategy that combines radar intensity heatmaps and coordinate heatmaps as input to a deep learning network. In addition, we design a multi-resolution data augmentation and training method suitable for radar to achieve target pose estimation for remote and multi-target application scenarios. Experimental results show that RVRU-Pose can achieve better than 2 cm average localization accuracy for different subjects in different scenarios, which is superior in terms of accuracy and time compared to existing state-of-the-art methods for human skeletal pose estimation with radar. As an essential performance parameter of radar, the impact of angular resolution on the estimation accuracy of a skeletal pose is quantitatively analyzed and evaluated in this paper. Finally, RVRU-Pose has also been extended to the task of estimating the skeletal pose of a cyclist, reflecting the strong scalability of the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2024
9. Deep-Learning-Based Daytime COT Retrieval and Prediction Method Using FY4A AGRI Data.
- Author
- Xu, Fanming, Song, Biao, Chen, Jianhua, Guan, Runda, Zhu, Rongjie, Liu, Jiayu, and Qiu, Zhongfeng
- Subjects
- CONVOLUTIONAL neural networks, PREDICTION models, DEEP learning, FORECASTING
- Abstract
The traditional method for retrieving cloud optical thickness (COT) is carried out through a Look-Up Table (LUT). Researchers must make a series of idealized assumptions and conduct extensive observations and record features in this scenario, consuming considerable resources. The emergence of deep learning effectively addresses the shortcomings of the traditional approach. In this paper, we first propose a daytime (solar zenith angle, SOZA < 70°) COT retrieval algorithm based on FY-4A AGRI. We establish and train a Convolutional Neural Network (CNN) model for COT retrieval, CM4CR, with CALIPSO's COT product, spatially and temporally synchronized, as the ground truth. Then, a deep learning method extended from video prediction models is adopted to predict COT values based on the retrieval results obtained from CM4CR. The COT prediction model (CPM) consists of an encoder, a predictor, and a decoder. On this basis, we further incorporate a time embedding module to enhance the model's ability to learn from irregular time intervals in the input COT sequence. During the training phase, we employed Charbonnier Loss and Edge Loss to enhance the model's capability to represent COT details. Experiments indicate that our CM4CR outperforms existing COT retrieval methods, with predictions showing better performance across several metrics than other benchmark prediction models. Additionally, this paper also investigates the impact of different lengths of COT input sequences and of the time intervals between adjacent COT frames on prediction performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
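The Charbonnier Loss and Edge Loss used during the CPM's training phase are simple to state: Charbonnier is a smooth L1 variant, sqrt(diff^2 + eps^2), and an edge term applies it to image gradients. A minimal sketch; using first differences as the gradient operator is my assumption, since the abstract does not specify it:

```python
import numpy as np

def charbonnier(pred, target, eps=1e-3):
    """Charbonnier loss: smooth L1 variant sqrt(diff^2 + eps^2), averaged.
    Differentiable at zero, less outlier-sensitive than L2."""
    return np.mean(np.sqrt((pred - target) ** 2 + eps ** 2))

def edge_loss(pred, target, eps=1e-3):
    """Charbonnier distance between horizontal and vertical first differences,
    encouraging the prediction to reproduce sharp COT edges."""
    dx = charbonnier(np.diff(pred, axis=1), np.diff(target, axis=1), eps)
    dy = charbonnier(np.diff(pred, axis=0), np.diff(target, axis=0), eps)
    return dx + dy

pred = np.zeros((4, 4))
target = np.ones((4, 4))
loss = charbonnier(pred, target)
```

For a constant offset of 1 the Charbonnier term is approximately 1 (eps only matters near zero), while the edge term is approximately zero because both images are flat.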
10. Transfer Learning-Based Specific Emitter Identification for ADS-B over Satellite System.
- Author
- Liu, Mingqian, Chai, Yae, Li, Ming, Wang, Jiakun, and Zhao, Nan
- Subjects
- CONVOLUTIONAL neural networks, LOW earth orbit satellites, AUTOMATIC dependent surveillance-broadcast, HUMAN fingerprints, FEATURE extraction, DISTRIBUTED sensors
- Abstract
In future aviation surveillance, the demand for higher real-time updates for global flights can be met by deploying automatic dependent surveillance–broadcast (ADS-B) receivers on low Earth orbit satellites, capitalizing on their global coverage and terrain-independent capabilities for seamless monitoring. Specific emitter identification (SEI) leverages the distinctive features of ADS-B data. High data collection and annotation costs, along with limited dataset size, can lead to overfitting during training and low model recognition accuracy. Transfer learning, which does not require source and target domain data to share the same distribution, significantly reduces the sensitivity of traditional models to data volume and distribution. It can also address issues related to the incompleteness and inadequacy of communication emitter datasets. This paper proposes a distributed sensor system based on transfer learning to address specific emitter identification. Firstly, signal fingerprint features are extracted using a bispectrum transform (BST) to train a convolutional neural network (CNN) preliminarily. Decision fusion is employed to tackle the challenges of the distributed system. Subsequently, a transfer learning strategy is employed, incorporating frozen model parameters, maximum mean discrepancy (MMD), and classification error measures to reduce the disparity between the target and source domains. A hyperbolic space module is introduced before the output layer to enhance the expressive capacity and data information extraction. After iterative training, the transfer learning model is obtained. Simulation results confirm that this method enhances model generalization, addresses the issue of slow convergence, and leads to improved training accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
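The bispectrum transform (BST) used here for fingerprint extraction has a standard direct estimator, B(f1, f2) = E[X(f1) X(f2) X*(f1 + f2)], averaged over signal segments; it is sensitive to quadratic phase coupling, which is what makes it useful as an emitter fingerprint. A minimal sketch with an illustrative phase-coupled test signal (segment length and frequencies are my choices, not the paper's):

```python
import numpy as np

def bispectrum(x, nfft=64):
    """Direct (FFT-based) bispectrum estimate averaged over non-overlapping segments:
    B(f1, f2) = E[X(f1) X(f2) conj(X(f1 + f2))]."""
    segs = x[:len(x) // nfft * nfft].reshape(-1, nfft)
    X = np.fft.fft(segs, axis=1)
    f = np.arange(nfft // 2)
    B = np.zeros((nfft // 2, nfft // 2), complex)
    for s in X:
        B += np.outer(s[f], s[f]) * np.conj(s[(f[:, None] + f[None, :]) % nfft])
    return B / len(X)

# Three tones at bins 6, 10, and 16 = 6 + 10 with locked phases: quadratic
# phase coupling produces a bispectral peak at (6, 10).
n = np.arange(4096)
x = (np.cos(2 * np.pi * 6 * n / 64)
     + np.cos(2 * np.pi * 10 * n / 64)
     + np.cos(2 * np.pi * 16 * n / 64))
B = bispectrum(x)
```

Ordinary power spectra discard this phase relationship; the bispectrum retains it, so nonlinear transmitter distortions leave a distinctive pattern in B.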
11. Combining "Deep Learning" and Physically Constrained Neural Networks to Derive Complex Glaciological Change Processes from Modern High-Resolution Satellite Imagery: Application of the GEOCLASS-Image System to Create VarioCNN for Glacier Surges.
- Author
- Herzfeld, Ute C., Hessburg, Lawrence J., Trantow, Thomas M., and Hayes, Adam N.
- Subjects
- REMOTE-sensing images, CONVOLUTIONAL neural networks, DEEP learning, GLACIERS, IMAGE recognition (Computer vision), ACCELERATION (Mechanics)
- Abstract
The objectives of this paper are to investigate the trade-offs between a physically constrained neural network and a deep, convolutional neural network and to design a combined ML approach ("VarioCNN"). Our solution is provided in the framework of a cyberinfrastructure that includes a newly designed ML software, GEOCLASS-image (v1.0), modern high-resolution satellite image data sets (Maxar WorldView data), and instructions/descriptions that may facilitate solving similar spatial classification problems. Combining the advantages of the physically-driven connectionist-geostatistical classification method with those of an efficient CNN, VarioCNN provides a means for rapid and efficient extraction of complex geophysical information from submeter resolution satellite imagery. A retraining loop overcomes the difficulties of creating a labeled training data set. Computational analyses and developments are centered on a specific, but generalizable, geophysical problem: The classification of crevasse types that form during the surge of a glacier system. A surge is a glacial catastrophe, an acceleration of a glacier to typically 100–200 times its normal velocity. GEOCLASS-image is applied to study the current (2016-2024) surge in the Negribreen Glacier System, Svalbard. The geophysical result is a description of the structural evolution and expansion of the surge, based on crevasse types that capture ice deformation in six simplified classes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
12. Remote Sensing Crop Recognition by Coupling Phenological Features and Off-Center Bayesian Deep Learning.
- Author
- Wu, Yongchuang, Wu, Penghai, Wu, Yanlan, Yang, Hui, and Wang, Biao
- Subjects
- REMOTE sensing, DEEP learning, RECURRENT neural networks, CONVOLUTIONAL neural networks, AREA measurement
- Abstract
Obtaining accurate and timely crop area information is crucial for crop yield estimates and food security. Because most existing crop mapping models based on remote sensing data have poor generalizability, they cannot be rapidly deployed for crop identification tasks in different regions. Based on a priori knowledge of phenology, we designed an off-center Bayesian deep learning remote sensing crop classification method that highlights phenological features, combined with an attention mechanism and residual connectivity. In this paper, we first optimize the input image and input features based on a phenology analysis. Then, a convolutional neural network (CNN), a recurrent neural network (RNN), and a random forest classifier (RFC) were built based on farm data in northeastern Inner Mongolia and used for comparisons with the method proposed here. Next, classification tests were performed on soybean, maize, and rice from four measurement areas in northeastern China to verify the accuracy of the above methods. To further explore the reliability of the method proposed in this paper, an uncertainty analysis was conducted by Bayesian deep learning to analyze the model's learning process and model structure for interpretability. Finally, statistical data collected in Suibin County, Heilongjiang Province, over many years, and in Shandong Province in 2020 were used as reference data to verify the applicability of the methods. The experimental results show that the classification accuracy of the three crops reached 90.73% overall and the average F1 and IOU were 89.57% and 81.48%, respectively. Furthermore, the proposed method can be directly applied to crop area estimations in different years in other regions based on its good correlation with official statistics. [ABSTRACT FROM AUTHOR]
- Published
- 2023
13. Hardware-Aware Design of Speed-Up Algorithms for Synthetic Aperture Radar Ship Target Detection Networks.
- Author
- Zhang, Yue, Jiang, Shuai, Cao, Yue, Xiao, Jiarong, Li, Chengkun, Zhou, Xuan, and Yu, Zhongjun
- Subjects
- SYNTHETIC aperture radar, RADAR targets, SYNTHETIC apertures, CONVOLUTIONAL neural networks, SUCCESSIVE approximation analog-to-digital converters, NAVAL architecture, ALGORITHMS
- Abstract
Recently, synthetic aperture radar (SAR) target detection algorithms based on Convolutional Neural Networks (CNN) have received increasing attention. However, the large amount of computation required burdens the real-time detection of SAR ship targets on resource-limited and power-constrained satellite-based platforms. In this paper, we propose a hardware-aware model speed-up method for single-stage SAR ship target detection tasks, oriented towards the Graphics Processing Unit (GPU), the most widely used hardware for neural network computing. We first analyze the process by which the detection task is executed on GPUs and propose two strategies according to this process. Firstly, in order to speed up the execution of the model on a GPU, we propose SAR-aware model quantization to allow the original model to be stored and computed in a low-precision format. Next, to ensure the loss of accuracy is negligible after the acceleration and compression process, precision-aware scheduling is used to filter out layers that are not suitable for quantization and store and execute them in a high-precision mode. Trained on the HRSID dataset, the effectiveness of this model speed-up algorithm was demonstrated by compressing four different sizes of models (yolov5n, yolov5s, yolov5m, yolov5l). The experimental results show that the detection speeds of yolov5n, yolov5s, yolov5m, and yolov5l can reach 234.7785 fps, 212.8341 fps, 165.6523 fps, and 139.8758 fps on the NVIDIA AGX Xavier development board with negligible loss of accuracy, which is 1.230, 1.469, 1.955, and 2.448 times faster, respectively, than before the use of this method. [ABSTRACT FROM AUTHOR]
- Published
- 2023
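The two strategies in this abstract, low-precision execution plus a precision-aware fallback for layers that quantize poorly, can be illustrated with symmetric per-tensor int8 weight quantization. This is a toy sketch of the idea only, not the paper's GPU scheduling; the function names, the fp16 fallback label, and the error tolerance are my assumptions:

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: w ~= scale * q, q in [-127, 127]."""
    scale = np.max(np.abs(w)) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def precision_aware_schedule(layers, tol=1e-5):
    """Keep a layer in int8 only if its round-trip quantization error stays below
    `tol`; otherwise fall back to a high-precision mode for that layer."""
    plan = {}
    for name, w in layers.items():
        q, s = quantize_int8(w)
        err = np.mean((w - q.astype(np.float64) * s) ** 2)
        plan[name] = "int8" if err < tol else "fp16"
    return plan

rng = np.random.default_rng(0)
layers = {
    # Narrow weight range: quantizes with tiny error.
    "conv1": rng.normal(0, 0.05, (64,)),
    # One large outlier inflates the scale, so small weights quantize badly.
    "head": rng.normal(0, 0.05, (64,)) + np.array([8.0] + [0.0] * 63),
}
plan = precision_aware_schedule(layers)
```

The outlier-dominated layer is the kind that precision-aware scheduling leaves in high precision, trading a little speed for negligible accuracy loss overall.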
14. Deep Learning for Earthquake Disaster Assessment: Objects, Data, Models, Stages, Challenges, and Opportunities.
- Author
- Jia, Jing and Ye, Wenjie
- Subjects
- DEEP learning, CONVOLUTIONAL neural networks, EARTHQUAKES, GENERATIVE adversarial networks, RECURRENT neural networks, IMAGE recognition (Computer vision)
- Abstract
Earthquake Disaster Assessment (EDA) plays a critical role in earthquake disaster prevention, evacuation, and rescue efforts. Deep learning (DL), which boasts advantages in image processing, signal recognition, and object detection, has facilitated scientific research in EDA. This paper analyses 204 articles through a systematic literature review to investigate the status quo, development, and challenges of DL for EDA. The paper first examines the distribution characteristics and trends of the two categories of EDA assessment objects, including earthquakes and secondary disasters as disaster objects, buildings, infrastructure, and areas as physical objects. Next, this study analyses the application distribution, advantages, and disadvantages of the three types of data (remote sensing data, seismic data, and social media data) mainly involved in these studies. Furthermore, the review identifies the characteristics and application of six commonly used DL models in EDA, including convolutional neural network (CNN), multi-layer perceptron (MLP), recurrent neural network (RNN), generative adversarial network (GAN), transfer learning (TL), and hybrid models. The paper also systematically details the application of DL for EDA at different times (i.e., pre-earthquake stage, during-earthquake stage, post-earthquake stage, and multi-stage). We find that the most extensive research in this field involves using CNNs for image classification to detect and assess building damage resulting from earthquakes. Finally, the paper discusses challenges related to training data and DL models, and identifies opportunities in new data sources, multimodal DL, and new concepts. This review provides valuable references for scholars and practitioners in related fields. [ABSTRACT FROM AUTHOR]
- Published
- 2023
15. Hyperspectral Image Classification via Spatial Shuffle-Based Convolutional Neural Network.
- Author
- Wang, Zhihui, Cao, Baisong, and Liu, Jun
- Subjects
- CONVOLUTIONAL neural networks, IMAGE recognition (Computer vision), SPECTRAL imaging
- Abstract
The unique spatial–spectral integration characteristics of hyperspectral imagery (HSI) make it widely applicable in many fields. The spatial–spectral feature fusion-based HSI classification has always been a research hotspot. Typically, classification methods based on spatial–spectral features select larger neighborhood windows to extract more spatial features for classification. However, this approach can also lead to the problem of non-independent training and testing sets to a certain extent. This paper proposes a spatial shuffle strategy that selects a smaller neighborhood window and randomly shuffles the pixels within the window. This strategy simulates the potential patterns of the pixel distribution in the real world as much as possible. Then, the samples of the three-dimensional HSI cube are transformed into two-dimensional images. Training with a simple CNN model whose architecture is not optimized can still achieve very high classification accuracy, indicating that the method proposed in this paper has considerable performance-improvement potential. The experimental results also indicate that smaller neighborhood windows can achieve the same, or even better, classification performance compared to larger neighborhood windows. [ABSTRACT FROM AUTHOR]
- Published
- 2023
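The spatial shuffle strategy is concrete enough to sketch: take a small neighborhood window around a labeled pixel, randomly permute the pixel positions inside it, and flatten the 3-D cube sample into a 2-D (pixels x bands) image. The window size and output layout below are assumptions for illustration:

```python
import numpy as np

def spatial_shuffle(cube, row, col, win=3, rng=None):
    """Extract a win x win neighborhood around (row, col) from an HSI cube of
    shape (H, W, Bands), randomly shuffle the pixel positions within the window,
    and flatten to a 2-D (win*win, Bands) sample for a simple CNN."""
    if rng is None:
        rng = np.random.default_rng()
    r = win // 2
    patch = cube[row - r:row + r + 1, col - r:col + r + 1, :]  # (win, win, B)
    pixels = patch.reshape(-1, patch.shape[-1])                # (win*win, B)
    perm = rng.permutation(pixels.shape[0])                    # shuffle positions,
    return pixels[perm]                                        # keep spectra intact

cube = np.arange(5 * 5 * 4).reshape(5, 5, 4).astype(float)  # toy 5x5 HSI, 4 bands
sample = spatial_shuffle(cube, 2, 2, win=3, rng=np.random.default_rng(0))
```

Shuffling destroys the exact spatial arrangement while preserving the set of neighboring spectra, which is the point of the strategy: the classifier cannot memorize pixel layouts shared between training and testing windows.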
16. Locating and Grading of Lidar-Observed Aircraft Wake Vortex Based on Convolutional Neural Networks.
- Author
- Zhang, Xinyu, Zhang, Hongwei, Wang, Qichao, Liu, Xiaoying, Liu, Shouxin, Zhang, Rongchuan, Li, Rongzhong, and Wu, Songhua
- Subjects
- CONVOLUTIONAL neural networks, DOPPLER lidar, AERONAUTICAL safety measures
- Abstract
Aircraft wake vortices are serious threats to aviation safety. The Pulsed Coherent Doppler Lidar (PCDL) has been widely used in the observation of aircraft wake vortices due to its advantages of high spatial-temporal resolution and high precision. However, the post-processing algorithms require significant computing resources, which cannot achieve the real-time detection of a wake vortex (WV). This paper presents an improved Convolutional Neural Network (CNN) method for WV locating and grading based on PCDL data to avoid the influence of unstable ambient wind fields on the localization and classification results of WV. Typical WV cases are selected for analysis, and the WV locating and grading models are validated on different test sets. The consistency of the analytical algorithm and the CNN algorithm is verified. The results indicate that the improved CNN method achieves satisfactory recognition accuracy with higher efficiency and better robustness, especially in the case of strong turbulence, where the CNN method recognizes the wake vortex while the analytical method cannot. The improved CNN method is expected to be applied to optimize the current aircraft spacing criteria, which is promising in terms of aviation safety and economic benefit improvement. [ABSTRACT FROM AUTHOR]
- Published
- 2024
17. Joint Classification of Hyperspectral and LiDAR Data Based on Adaptive Gating Mechanism and Learnable Transformer.
- Author
- Wang, Minhui, Sun, Yaxiu, Xiang, Jianhong, Sun, Rui, and Zhong, Yu
- Subjects
- TRANSFORMER models, CONVOLUTIONAL neural networks, LIDAR, DIGITAL elevation models, TRANSFER matrix, DATA fusion (Statistics)
- Abstract
Utilizing multi-modal data, as opposed to only hyperspectral image (HSI), enhances target identification accuracy in remote sensing. Transformers are applied to multi-modal data classification for their long-range dependency but often overlook intrinsic image structure by directly flattening image blocks into vectors. Moreover, as the encoder deepens, unprofitable information negatively impacts classification performance. Therefore, this paper proposes a learnable transformer with an adaptive gating mechanism (AGMLT). Firstly, a spectral–spatial adaptive gating mechanism (SSAGM) is designed to comprehensively extract the local information from images. It mainly contains point depthwise attention (PDWA) and asymmetric depthwise attention (ADWA). The former is for extracting spectral information of HSI, and the latter is for extracting spatial information of HSI and elevation information of LiDAR-derived rasterized digital surface models (LiDAR-DSM). By omitting linear layers, local continuity is maintained. Then, LayerScale and a learnable transition matrix are introduced into the original transformer encoder and self-attention to form the learnable transformer (L-Former). It improves data dynamics and prevents performance degradation as the encoder deepens. Subsequently, learnable cross-attention (LC-Attention) with the learnable transfer matrix is designed to augment the fusion of multi-modal data by enriching feature information. Finally, poly loss, known for its adaptability with multi-modal data, is employed in training the model. Experiments in the paper are conducted on four famous multi-modal datasets: Trento (TR), MUUFL (MU), Augsburg (AU), and Houston2013 (HU). The results show that AGMLT achieves optimal performance over some existing models. [ABSTRACT FROM AUTHOR]
- Published
- 2024
18. CroplandCDNet: Cropland Change Detection Network for Multitemporal Remote Sensing Images Based on Multilayer Feature Transmission Fusion of an Adaptive Receptive Field.
- Author
- Wu, Qiang, Huang, Liang, Tang, Bo-Hui, Cheng, Jiapei, Wang, Meiqi, and Zhang, Zixuan
- Subjects
- CONVOLUTIONAL neural networks, CHANGE-point problems, FARMS, MARKOV random fields, REMOTE-sensing images, FEATURE extraction
- Abstract
Dynamic monitoring of cropland using high spatial resolution remote sensing images is a powerful means to protect cropland resources. However, when a change detection method based on a convolutional neural network employs a large number of convolution and pooling operations to mine the deep features of cropland, the accumulation of irrelevant features and the loss of key features will lead to poor detection results. To effectively solve this problem, a novel cropland change detection network (CroplandCDNet) is proposed in this paper; this network combines an adaptive receptive field and multiscale feature transmission fusion to achieve accurate detection of cropland change information. CroplandCDNet first effectively extracts the multiscale features of cropland from bitemporal remote sensing images through the feature extraction module and subsequently embeds the receptive field adaptive SK attention (SKA) module to emphasize cropland change. Moreover, the SKA module effectively uses spatial context information for the dynamic adjustment of the convolution kernel size of cropland features at different scales. Finally, multiscale features and difference features are transmitted and fused layer by layer to obtain the content of cropland change. In the experiments, the proposed method is compared with six advanced change detection methods using the cropland change detection dataset (CLCD). The experimental results show that CroplandCDNet achieves the best F1 and OA at 76.04% and 94.47%, respectively. Its precision and recall are second best of all models at 76.46% and 75.63%, respectively. Moreover, a generalization experiment was carried out using the Jilin-1 dataset, which effectively verified the reliability of CroplandCDNet in cropland change detection. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
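The receptive-field-adaptive selection performed by the SKA module in the entry above can be illustrated with a minimal NumPy sketch. This is a toy stand-in, not the authors' code: a real selective-kernel module learns its selection weights with small fully connected layers, which are replaced here by fixed toy logits.

```python
import numpy as np

def sk_fuse(feat_a, feat_b):
    """Selective-kernel style fusion of two branch feature maps (C, H, W)
    produced with different kernel sizes. A channel-wise softmax chooses,
    per channel, which branch dominates, mimicking an adaptive kernel size.
    Toy sketch: the selection logits below stand in for learned FC layers."""
    s = (feat_a + feat_b).mean(axis=(1, 2))          # global average pooling -> (C,)
    logits = np.stack([s, -s])                       # (2, C) toy selection logits
    w = np.exp(logits) / np.exp(logits).sum(axis=0)  # softmax over the two branches
    return w[0][:, None, None] * feat_a + w[1][:, None, None] * feat_b

fused = sk_fuse(np.ones((4, 8, 8)), np.zeros((4, 8, 8)))
print(fused.shape)  # (4, 8, 8)
```

The softmax guarantees the two branch weights sum to one per channel, so the fused map stays in the convex hull of the branch responses.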
19. LDnADMM-Net: A Denoising Unfolded Deep Neural Network for Direction-of-Arrival Estimations in a Low Signal-to-Noise Ratio.
- Author
-
Liang, Can, Liu, Mingxuan, Li, Yang, Wang, Yanhua, and Hu, Xueyao
- Subjects
DIRECTION of arrival estimation ,CONVOLUTIONAL neural networks ,SIGNAL-to-noise ratio ,COMPRESSED sensing ,SIGNAL denoising - Abstract
In this paper, we explore the problem of direction-of-arrival (DOA) estimation for a non-uniform linear array (NULA) under strong noise. Compressed sensing (CS)-based methods are widely used in NULA DOA estimation. However, these methods commonly rely on parameters that are hard to fine-tune. Additionally, these methods lack robustness under strong noise. To address these issues, this paper proposes a novel DOA estimation approach using a deep neural network (DNN) for a NULA in a low signal-to-noise ratio (SNR). The proposed network is designed based on the denoising convolutional neural network (DnCNN) and the alternating direction method of multipliers (ADMM), and is dubbed LDnADMM-Net. First, we construct an unfolded DNN architecture that mimics the iterative processing of an ADMM. In this way, the parameters of an ADMM can be transformed into the network weights, and thus we can adaptively optimize these parameters through network training. Then, we employ the DnCNN to develop a denoising module (DnM) and integrate it into the unfolded DNN. Using this DnM, we can enhance the anti-noise ability of the proposed network and obtain a robust DOA estimation in a low SNR. The simulation and experimental results show that the proposed LDnADMM-Net can obtain high-accuracy and super-resolution DOA estimations for a NULA with strong robustness in a low SNR. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
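The unfolding idea in the entry above starts from classical ADMM for sparse recovery; the network then replaces the fixed step parameters with trained per-layer weights. Below is a minimal NumPy sketch of the non-learned baseline (the LASSO formulation and the fixed `rho`/`tau` values are illustrative assumptions, not the paper's exact model):

```python
import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of the l1 norm, the core nonlinearity that an
    unfolded ADMM layer applies; tau becomes a learnable weight in the net."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def admm_lasso(A, y, rho=1.0, tau=0.1, n_iter=50):
    """Plain ADMM for min ||Ax - y||^2 + lambda*||x||_1 (sketch; an unfolded
    network would turn each loop iteration into a layer with trained rho/tau)."""
    m, n = A.shape
    x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
    Q = np.linalg.inv(A.T @ A + rho * np.eye(n))
    for _ in range(n_iter):
        x = Q @ (A.T @ y + rho * (z - u))   # x-update (regularized least squares)
        z = soft_threshold(x + u, tau)      # z-update (sparsity-promoting prox)
        u = u + x - z                       # dual ascent
    return z
```

In the unfolded network, each loop iteration becomes a layer, so `rho` and `tau` turn into weights optimized by backpropagation instead of hand-tuned constants.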
20. CPINet: Towards A Novel Cross-Polarimetric Interaction Network for Dual-Polarized SAR Ship Classification.
- Author
-
He, Jinglu, Sun, Ruiting, Kong, Yingying, Chang, Wenlong, Sun, Chenglu, Chen, Gaige, Li, Yinghua, Meng, Zhe, and Wang, Fuping
- Subjects
CONVOLUTIONAL neural networks ,AUTOMATIC target recognition ,SYNTHETIC aperture radar ,IMAGE recognition (Computer vision) ,FEATURE extraction ,DEEP learning - Abstract
With the rapid development of the modern world, it is imperative to achieve effective and efficient monitoring for territories of interest, especially for the broad ocean area. For surveillance of ship targets at sea, a common and powerful approach is to take advantage of satellite synthetic aperture radar (SAR) systems. Currently, using satellite SAR images for ship classification is a challenging issue due to complex sea situations and the imaging variances of ships. Fortunately, the emergence of advanced satellite SAR sensors has shed much light on the SAR ship automatic target recognition (ATR) task, e.g., utilizing dual-polarization (dual-pol) information to boost the performance of SAR ship classification. Therefore, in this paper we have developed a novel cross-polarimetric interaction network (CPINet) to explore the abundant polarization information of dual-pol SAR images with the help of deep learning strategies, leading to an effective solution for high-performance ship classification. First, we establish a novel multiscale deep feature extraction framework to fully mine the characteristics of dual-pol SAR images in a coarse-to-fine manner. Second, to further leverage the complementary information of dual-pol SAR images, we propose a mixed-order squeeze–excitation (MO-SE) attention mechanism, in which the first- and second-order statistics of the deep features from one single-polarized SAR image are extracted to guide the learning of another polarized one. Then, the intermediate multiscale fused and MO-SE augmented dual-polarized deep feature maps are respectively aggregated by the factorized bilinear coding (FBC) pooling method. Meanwhile, the last multiscale fused deep feature maps for each single-polarized SAR image are also individually aggregated by the FBC. Finally, four kinds of highly discriminative deep representations are obtained for loss computation and category prediction. 
For better network training, the gradient normalization (GradNorm) method for multitask networks is extended to adaptively balance the contribution of each loss component. Extensive experiments on the three- and five-category dual-pol SAR ship classification dataset collected from the open and free OpenSARShip database demonstrate the superiority and robustness of CPINet compared with state-of-the-art methods for the dual-polarized SAR ship classification task. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
21. Domain Adaptation for Satellite-Borne Multispectral Cloud Detection.
- Author
-
Du, Andrew, Doan, Anh-Dzung, Law, Yee Wei, and Chin, Tat-Jun
- Subjects
CONVOLUTIONAL neural networks ,MACHINE learning ,DATA transmission systems ,ALGORITHMS ,BANDWIDTHS ,MULTISPECTRAL imaging - Abstract
The advent of satellite-borne machine learning hardware accelerators has enabled the onboard processing of payload data using machine learning techniques such as convolutional neural networks (CNNs). A notable example is using a CNN to detect the presence of clouds in the multispectral data captured on Earth observation (EO) missions, whereby only clear sky data are downlinked to conserve bandwidth. However, prior to deployment, new missions that employ new sensors will not have enough representative datasets to train a CNN model, while a model trained solely on data from previous missions will underperform when deployed to process the data on the new missions. This underperformance stems from the domain gap, i.e., differences in the underlying distributions of the data generated by the different sensors in previous and future missions. In this paper, we address the domain gap problem in the context of onboard multispectral cloud detection. Our main contributions lie in formulating new domain adaptation tasks that are motivated by a concrete EO mission, developing a novel algorithm for bandwidth-efficient supervised domain adaptation, and demonstrating test-time adaptation algorithms on space deployable neural network accelerators. Our contributions enable minimal data transmission to be invoked (e.g., only 1% of the weights in ResNet50) to achieve domain adaptation, thereby allowing more sophisticated CNN models to be deployed and updated on satellites without being hampered by domain gap and bandwidth limitations. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
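The "only 1% of the weights in ResNet50" figure in the entry above suggests how small an uplinked adaptation payload can be. A toy accounting sketch follows; the smallest-tensors-first selection rule and the layer names are illustrative assumptions, not the paper's algorithm:

```python
import numpy as np

def adaptation_payload(params, frac=0.01):
    """Toy calculation of the uplink cost when only a fraction of a model's
    weights are updated on board. `params` maps layer name -> weight array;
    frac=0.01 mirrors the 1% figure quoted above."""
    total = sum(v.size for v in params.values())
    budget = int(total * frac)
    chosen, used = [], 0
    # Pick the smallest tensors first until the budget is spent -- a stand-in
    # for "adapt only the cheap layers" strategies.
    for name, v in sorted(params.items(), key=lambda kv: kv[1].size):
        if used + v.size <= budget:
            chosen.append(name)
            used += v.size
    return chosen, used, total

params = {"head": np.zeros(50), "bn": np.zeros(10), "backbone": np.zeros(10000)}
chosen, used, total = adaptation_payload(params)
print(chosen, used, total)  # ['bn', 'head'] 60 10060
```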
22. MDFA-Net: Multi-Scale Differential Feature Self-Attention Network for Building Change Detection in Remote Sensing Images.
- Author
-
Li, Yuanling, Zou, Shengyuan, Zhao, Tianzhong, and Su, Xiaohui
- Subjects
CONVOLUTIONAL neural networks ,TRANSFORMER models ,FEATURE extraction ,REMOTE sensing ,URBAN studies - Abstract
Building change detection (BCD) from remote sensing images is an essential field for urban studies. In this well-developed field, Convolutional Neural Networks (CNNs) and Transformers have been leveraged to empower BCD models in handling multi-scale information. However, it is still challenging to accurately detect subtle changes using current models, which has been the main bottleneck to improving detection accuracy. In this paper, a multi-scale differential feature self-attention network (MDFA-Net) is proposed to effectively integrate CNN and Transformer by balancing the global receptive field from the self-attention mechanism and the local receptive field from convolutions. In MDFA-Net, two innovative modules were designed. Particularly, a hierarchical multi-scale dilated convolution (HMDConv) module was proposed to extract local features with hybrid dilation convolutions, which can ameliorate the effect of CNN's local bias. In addition, a differential feature self-attention (DFA) module was developed to implement the self-attention mechanism on multi-scale difference feature maps to overcome the problem that local details may be lost in the global receptive field in Transformers. The proposed MDFA-Net achieves state-of-the-art accuracy performance in comparison with related works, e.g., USSFC-Net, on three open datasets: WHU-CD, CDD-CD, and LEVIR-CD. Based on the experimental results, MDFA-Net significantly exceeds other models in F1 score, IoU, and overall accuracy; the F1 score is 93.81%, 95.52%, and 91.21% in the WHU-CD, CDD-CD, and LEVIR-CD datasets, respectively. Furthermore, MDFA-Net achieved first or second place in precision and recall in the test in all three datasets, which indicates a better balance between precision and recall than other models. We also found that subtle changes, i.e., small-sized building changes and irregular boundary changes, are better detected thanks to the introduction of HMDConv and DFA. 
To this end, with its better ability to leverage multi-scale differential information than traditional methods, MDFA-Net provides a novel and effective avenue to integrate CNN and Transformer in BCD. Further studies could focus on improving the model's insensitivity to hyper-parameters and the model's generalizability in practical applications. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
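The F1 and IoU scores reported in the entry above are linked by a fixed identity for per-pixel counts, IoU = F1 / (2 - F1); a minimal sketch:

```python
def iou(tp, fp, fn):
    """Intersection-over-Union from pixel counts: TP / (TP + FP + FN).
    With F1 = 2*TP / (2*TP + FP + FN), it follows that IoU = F1 / (2 - F1)."""
    return tp / (tp + fp + fn) if tp + fp + fn else 0.0
```

For example, an F1 of 0.8 always corresponds to an IoU of 0.8 / 1.2, about 0.667, regardless of dataset size.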
23. Enhancing Digital Twins with Human Movement Data: A Comparative Study of Lidar-Based Tracking Methods.
- Author
-
Karki, Shashank, Pingel, Thomas J., Baird, Timothy D., Flack, Addison, and Ogle, Todd
- Subjects
INDOOR positioning systems ,DIGITAL twins ,COMPUTER vision ,CONVOLUTIONAL neural networks ,HUMAN mechanics - Abstract
Digital twins, used to represent dynamic environments, require accurate tracking of human movement to enhance their real-world application. This paper contributes to the field by systematically evaluating and comparing pre-existing tracking methods to identify strengths, weaknesses, and practical applications within digital twin frameworks. The purpose of this study is to assess the efficacy of existing human movement tracking techniques for digital twins in real-world environments, with the goal of improving spatial analysis and interaction within these virtual models. We compare three approaches using indoor-mounted lidar sensors: (1) a frame-by-frame deep learning model based on convolutional neural networks (CNNs), (2) custom algorithms developed using OpenCV, and (3) the off-the-shelf lidar perception software package Percept version 1.6.3. Of these, the deep learning method performed best (F1 = 0.88), followed by Percept (F1 = 0.61), and finally the custom algorithms using OpenCV (F1 = 0.58). Each method had particular strengths and weaknesses, with OpenCV-based approaches that use frame comparison vulnerable to signal instability that is manifested as "flickering" in the dataset. Subsequent analysis of the spatial distribution of error revealed that both the custom algorithms and Percept took longer to acquire an identification, resulting in increased error near doorways. Percept software excelled in scenarios involving stationary individuals. These findings highlight the importance of selecting appropriate tracking methods for specific use cases. Future work will focus on model optimization, alternative data logging techniques, and innovative approaches to mitigate computational challenges, paving the way for more sophisticated and accessible spatial analysis tools. 
Integrating complementary sensor types and strategies, such as radar, audio levels, indoor positioning systems (IPSs), and wi-fi data, could further improve detection accuracy and validation while maintaining privacy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
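The F1 values used in the entry above to rank the three tracking methods (0.88, 0.61, 0.58) are the usual harmonic mean of detection precision and recall; a minimal sketch with made-up counts:

```python
def f1_score(tp, fp, fn):
    """Precision, recall, and F1 from detection counts (true positives,
    false positives, false negatives). F1 is the harmonic mean of P and R."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```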
24. GNSS-IR Soil Moisture Retrieval Using Multi-Satellite Data Fusion Based on Random Forest.
- Author
-
Jiang, Yao, Zhang, Rui, Sun, Bo, Wang, Tianyu, Zhang, Bo, Tu, Jinsheng, Nie, Shihai, Jiang, Hang, and Chen, Kangyi
- Subjects
ARTIFICIAL neural networks ,CONVOLUTIONAL neural networks ,GLOBAL Positioning System ,STANDARD deviations ,RADIAL basis functions - Abstract
The accuracy and reliability of soil moisture retrieval based on Global Positioning System (GPS) single-satellite Signal-to-Noise Ratio (SNR) data are low due to the influence of spatial and temporal differences among satellites. Therefore, this paper proposes a Random Forest (RF)-based multi-satellite data fusion Global Navigation Satellite System Interferometric Reflectometry (GNSS-IR) soil moisture retrieval method, which utilizes the RF model's Mean Decrease Impurity (MDI) algorithm to adaptively assign arc weights to fuse all available satellite data to obtain accurate retrieval results. Subsequently, the effectiveness of the proposed method was validated using GPS data from the Plate Boundary Observatory (PBO) network sites P041 and P037, as well as data collected in Lamasquere, France. A Support Vector Machine (SVM) model, Radial Basis Function (RBF) neural network model, and Convolutional Neural Network (CNN) model are introduced for the comparison of accuracy. The results indicated that the proposed method had the best retrieval performance, with Root Mean Square Error (RMSE) values of 0.032, 0.028, and 0.003 cm³/cm³, Mean Absolute Error (MAE) values of 0.025, 0.022, and 0.002 cm³/cm³, and correlation coefficients (R) of 0.94, 0.95, and 0.98, respectively, at the three sites. Therefore, the proposed soil moisture retrieval model demonstrates strong robustness and generalization capabilities, providing a reference for achieving high-precision, real-time monitoring of soil moisture. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
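The MDI-weighted fusion step described in the entry above reduces, at its core, to an importance-weighted average of per-satellite estimates. A minimal sketch with toy numbers (real MDI importances would come from a trained Random Forest, e.g. scikit-learn's `feature_importances_`):

```python
import numpy as np

def fuse_arcs(estimates, importances):
    """Fuse per-satellite soil-moisture estimates with weights proportional
    to Random Forest MDI importances. Toy values, not real GNSS-IR data."""
    w = np.asarray(importances, dtype=float)
    w = w / w.sum()                     # normalize importances to fusion weights
    return float(np.dot(w, estimates))  # importance-weighted average retrieval
```

Satellites whose arcs the forest found more informative contribute proportionally more to the fused retrieval.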
25. AMHFN: Aggregation Multi-Hierarchical Feature Network for Hyperspectral Image Classification.
- Author
-
Yang, Xiaofei, Luo, Yuxiong, Zhang, Zhen, Tang, Dong, Zhou, Zheng, and Tang, Haojin
- Subjects
CONVOLUTIONAL neural networks ,IMAGE recognition (Computer vision) ,DEEP learning ,TRANSFORMER models ,IMAGE fusion - Abstract
Deep learning methods like convolutional neural networks (CNNs) and transformers are successfully applied in hyperspectral image (HSI) classification due to their ability to extract local contextual features and explore global dependencies, respectively. However, CNNs struggle to model long-term dependencies, and transformers may miss subtle spatial-spectral features. To address these challenges, this paper proposes an innovative hybrid HSI classification method aggregating hierarchical spatial-spectral features from a CNN and long pixel dependencies from a transformer. The proposed aggregation multi-hierarchical feature network (AMHFN) is designed to capture various hierarchical features and long dependencies from HSI, improving classification accuracy and efficiency. The proposed AMHFN consists of three key modules: (a) a Local-Pixel Embedding module (LPEM) for capturing prominent spatial-spectral features; (b) a Multi-Scale Convolutional Extraction (MSCE) module to capture multi-scale local spatial-spectral features and aggregate hierarchical local features; (c) a Multi-Scale Global Extraction (MSGE) module to explore multi-scale global dependencies and integrate multi-scale hierarchical global dependencies. Rigorous experiments on three public HSI datasets demonstrated the superior performance of the proposed AMHFN method. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
26. A Multi-Task Convolutional Neural Network Relative Radiometric Calibration Based on Temporal Information.
- Author
-
Tang, Lei, Zhao, Xiangang, Hu, Xiuqing, Luo, Chuyao, and Lin, Manjun
- Subjects
CONVOLUTIONAL neural networks ,RADIOMETRIC methods ,COMPUTER vision ,REMOTE-sensing images ,COMPUTER simulation ,DEEP learning - Abstract
Due to the continuous degradation of onboard satellite instruments over time, satellite images undergo degradation, necessitating calibration for tasks reliant on satellite data. The previous relative radiometric calibration methods are mainly categorized into traditional methods and deep learning methods. The traditional methods involve complex computations for each calibration, while deep-learning-based approaches tend to oversimplify the calibration process, utilizing generic computer vision models without tailored structures for calibration tasks. In this paper, we address the unique challenges of calibration by introducing a novel approach: a multi-task convolutional neural network calibration model leveraging temporal information. This pioneering method is the first to integrate temporal dynamics into the architecture of neural network calibration models. Extensive experiments conducted on the FY3A/B/C VIRR datasets showcase the superior performance of our approach compared to the existing state-of-the-art traditional and deep learning methods. Furthermore, tests with various backbones confirm the broad applicability of our framework across different convolutional neural networks. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
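Relative radiometric calibration, in its simplest classical form, is a per-band linear gain/offset correction of raw digital numbers; the network in the entry above instead learns the correction from temporal context. A minimal sketch with illustrative coefficients (the gain and offset values are assumptions for demonstration only):

```python
import numpy as np

def relative_calibrate(dn, gain, offset):
    """Linear relative radiometric correction: map the raw digital numbers
    of a degraded sensor back toward a reference response. The coefficients
    here are illustrative; a learned model predicts the correction instead."""
    return gain * np.asarray(dn, dtype=float) + offset

corrected = relative_calibrate([100, 200], gain=1.05, offset=-2.0)
print(corrected)  # [103. 208.]
```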
27. Modeling and Forecasting Ionospheric foF2 Variation Based on CNN-BiLSTM-TPA during Low- and High-Solar Activity Years.
- Author
-
Xu, Baoyi, Huang, Wenqiang, Ren, Peng, Li, Yi, and Xiang, Zheng
- Subjects
MACHINE learning ,CONVOLUTIONAL neural networks ,SOLAR activity ,IONOSPHERE ,PREDICTION models ,SOLAR cycle - Abstract
The transmission of high-frequency signals over long distances depends on the ionosphere's reflective properties, with the selection of operating frequencies being closely tied to variations in the ionosphere. The accurate prediction of ionospheric critical frequency foF2 and other parameters in low latitudes is of great significance for understanding ionospheric changes in high-frequency communications. Currently, deep learning algorithms demonstrate significant advantages in capturing characteristics of the ionosphere. In this paper, a state-of-the-art hybrid neural network is utilized in conjunction with a temporal pattern attention mechanism for predicting variations in the foF2 parameter during high- and low-solar activity years. Convolutional neural networks (CNNs) and bidirectional long short-term memory (BiLSTM), which are capable of extracting spatiotemporal features of ionospheric variations, are incorporated into a hybrid neural network. The foF2 data used for training and testing come from three observatories in Brisbane (27°53′S, 152°92′E), Darwin (12°45′S, 130°95′E) and Townsville (19°63′S, 146°85′E) in 2000, 2008, 2009 and 2014 (the peak or trough years of solar activity in solar cycles 23 and 24), using the advanced Australian Digital Ionospheric Sounder. The results show that the proposed model accurately captures the changes in ionospheric foF2 characteristics and outperforms the International Reference Ionosphere 2020 (IRI-2020) and BiLSTM ionospheric prediction models. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
28. Ionospheric TEC Prediction in China during Storm Periods Based on Deep Learning: Mixed CNN-BiLSTM Method.
- Author
-
Ren, Xiaochen, Zhao, Biqiang, Ren, Zhipeng, and Xiong, Bo
- Subjects
CONVOLUTIONAL neural networks ,METEOROLOGICAL research ,HISTORICAL maps ,STORMS ,DEEP learning - Abstract
Applying deep learning to high-precision ionospheric parameter prediction is a significant and growing field within the realm of space weather research. This paper proposes an improved model, Mixed Convolutional Neural Network (CNN)—Bidirectional Long Short-Term Memory (BiLSTM), for predicting the Total Electron Content (TEC) in China. This model was trained using the longest available Global Ionospheric Maps (GIM)-TEC from 1998 to 2023 in China, and underwent an interpretability analysis and accuracy evaluation. The results indicate that historical TEC maps play the most critical role, followed by Kp, ap, AE, F10.7, and time factor. The contributions of Dst and Disturbance Index (DI) to improving accuracy are relatively small but still essential. In long-term predictions, the contributions of the geomagnetic index, solar activity index, and time factor are higher. In addition, the model performs well in short-term predictions, accurately capturing the occurrence, evolution, and classification of ionospheric storms. However, as the predicted length increases, the accuracy gradually decreases, and some erroneous predictions may occur. The northeast region exhibits lower accuracy but a higher F1 score, which may be attributed to the frequency of ionospheric storm occurrences in different locations. Overall, the model effectively predicts the trends and evolution processes of ionospheric storms. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
29. Cross-Hopping Graph Networks for Hyperspectral–High Spatial Resolution (H²) Image Classification.
- Author
-
Chen, Tao, Wang, Tingting, Chen, Huayue, Zheng, Bochuan, and Deng, Wu
- Subjects
CONVOLUTIONAL neural networks ,IMAGE recognition (Computer vision) ,FEATURE extraction ,REMOTE sensing ,IMAGE fusion ,MULTISPECTRAL imaging - Abstract
Remote sensing images are gradually advancing towards hyperspectral–high spatial resolution (H²) double-high images. However, high resolution produces serious spatial heterogeneity and spectral variability while improving image resolution, which increases the difficulty of feature recognition. To make the best of spectral and spatial features with an insufficient number of labeled samples, we aim to achieve effective recognition and accurate classification of features in H² images. In this paper, a cross-hop graph network for H² image classification (H²-CHGN) is proposed. It is a two-branch network for deep feature extraction geared towards H² images, consisting of a cross-hop graph attention network (CGAT) and a multiscale convolutional neural network (MCNN): the CGAT branch utilizes the superpixel information of H² images to filter samples with high spatial relevance and designate them as the samples to be classified, then utilizes the cross-hop graph and attention mechanism to broaden the range of graph convolution to obtain more representative global features. As another branch, the MCNN uses dual convolutional kernels to extract features and fuse them at various scales while attaining pixel-level multi-scale local features by parallel cross connections. Finally, the dual-channel attention mechanism is utilized for fusion to make image elements more prominent. Experiments on a classical dataset (Pavia University) and double-high (H²) datasets (WHU-Hi-LongKou and WHU-Hi-HongHu) show that the H²-CHGN can be efficiently and competently used in H² image classification. In detail, experimental results showcase superior performance, outpacing state-of-the-art methods by 0.75–2.16% in overall accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
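The "cross-hop" widening of graph convolution's range described in the entry above can be pictured with adjacency-matrix powers: the nodes reachable within k hops. A minimal NumPy sketch (a binary adjacency with self-loops is an illustrative simplification of the paper's cross-hop graph):

```python
import numpy as np

def k_hop_adjacency(A, k):
    """Return the binary matrix of node pairs reachable within k hops.
    A is a binary adjacency matrix; self-loops are added so shorter paths
    are always included. Sketch of the idea, not the paper's construction."""
    A = np.asarray(A, dtype=float) + np.eye(len(A))  # add self-loops
    R = np.linalg.matrix_power(A, k)                 # path counts up to k hops
    return (R > 0).astype(int)
```

On a path graph 0-1-2, node 2 enters node 0's neighborhood only once two hops are allowed, which is exactly the extra context a cross-hop convolution sees.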
30. A Global Spatial-Spectral Feature Fused Autoencoder for Nonlinear Hyperspectral Unmixing.
- Author
-
Zhang, Mingle, Yang, Mingyu, Xie, Hongyu, Yue, Pinliang, Zhang, Wei, Jiao, Qingbin, Xu, Liang, and Tan, Xin
- Subjects
CONVOLUTIONAL neural networks ,DATA mining ,FEATURE extraction ,PIXELS ,NOISE ,DEEP learning - Abstract
Hyperspectral unmixing (HU) aims to decompose mixed pixels into a set of endmembers and corresponding abundances. Deep learning-based HU methods are currently a hot research topic, but most existing unmixing methods still rely on per-pixel training or employ convolutional neural networks (CNNs), which overlook the non-local correlations of materials and spectral characteristics. Furthermore, current research mainly focuses on linear mixing models, which limits the feature extraction capability of deep encoders and further improvement in unmixing accuracy. In this paper, we propose a nonlinear unmixing network capable of extracting global spatial-spectral features. The network is designed based on an autoencoder architecture, where a dual-stream CNN is employed in the encoder to separately extract spectral and local spatial information. The extracted features are then fused together to form a more complete representation of the input data. Subsequently, a linear projection-based multi-head self-attention mechanism is applied to capture global contextual information, allowing for comprehensive spatial information extraction while maintaining lightweight computation. To achieve better reconstruction performance, a model-free nonlinear mixing approach is adopted to enhance the model's universality, with the mixing model learned entirely from the data. Additionally, an initialization method based on endmember bundles is utilized to reduce interference from outliers and noise. Comparative results on real datasets against several state-of-the-art unmixing methods demonstrate the superiority of the proposed approach. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
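The abundance constraints that unmixing autoencoders enforce (nonnegativity and sum-to-one) are typically implemented with a softmax feeding the decoder. A minimal sketch under the linear mixing model (the network in the entry above instead learns a nonlinear mixing from data, so this is only the constraint mechanics):

```python
import numpy as np

def reconstruct(endmembers, logits):
    """Decoder step of an unmixing autoencoder under the linear model.
    endmembers: (P, B) matrix of P endmember spectra over B bands.
    logits: (P,) raw encoder outputs; softmax yields valid abundances."""
    a = np.exp(logits - logits.max())
    a = a / a.sum()                  # abundances: a >= 0 and sum(a) == 1
    return endmembers.T @ a, a       # reconstructed pixel spectrum, abundances
```

With two orthogonal toy endmembers and equal logits, the pixel is reconstructed as an even 50/50 mixture.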
31. 1D-CNN-Transformer for Radar Emitter Identification and Implemented on FPGA.
- Author
-
Gao, Xiangang, Wu, Bin, Li, Peng, and Jing, Zehuan
- Subjects
ARTIFICIAL neural networks ,MACHINE learning ,FIELD programmable gate arrays ,CONVOLUTIONAL neural networks ,ENERGY consumption - Abstract
Deep learning has brought great development to radar emitter identification technology. In addition, specific emitter identification (SEI), as a branch of radar emitter identification, has also benefited from it. However, the complexity of most deep learning algorithms makes it difficult to adapt to the requirements of the low power consumption and high-performance processing of SEI on embedded devices, so this article proposes solutions from the aspects of software and hardware. On the software side, we design a Transformer variant network, the lightweight convolutional Transformer (LW-CT), which supports parameter sharing. Then, we cascade convolutional neural networks (CNNs) and the LW-CT to construct a one-dimensional CNN-Transformer (1D-CNN-Transformer) lightweight neural network model that can capture the long-range dependencies of radar emitter signals while extracting spatial-domain signal features. In terms of hardware, we design a low-power neural network accelerator based on an FPGA to complete the real-time recognition of radar emitter signals. The accelerator not only provides high-efficiency computing engines for the network, but also devises a reconfigurable buffer called "Ping-pong CBUF" and a two-level pipeline architecture for the convolution layer to alleviate the bottleneck caused by the off-chip storage access bandwidth. Experimental results show that the algorithm can achieve high SEI recognition performance with a low calculation overhead. In addition, the hardware acceleration platform not only perfectly meets the requirements of the radar emitter recognition system for low power consumption and high-performance processing, but also outperforms the accelerators in other papers in terms of the energy efficiency ratio of Transformer layer processing. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
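The 1D-CNN front end in the entry above slides kernels along the radar signal sequence; note that deep learning "convolution" layers actually compute cross-correlation. A minimal sketch of a valid-mode 1-D layer without padding or stride:

```python
import numpy as np

def conv1d_valid(x, k):
    """Valid-mode 1-D convolution as deep learning layers implement it
    (cross-correlation: no kernel flip). Output length is len(x)-len(k)+1."""
    n = len(x) - len(k) + 1
    return np.array([np.dot(x[i:i + len(k)], k) for i in range(n)])

print(conv1d_valid([1, 2, 3, 4], [1, 1]))  # [3 5 7]
```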
32. Pyramid Cascaded Convolutional Neural Network with Graph Convolution for Hyperspectral Image Classification.
- Author
-
Pan, Haizhu, Yan, Hui, Ge, Haimiao, Wang, Liguo, and Shi, Cuiping
- Subjects
CONVOLUTIONAL neural networks ,IMAGE recognition (Computer vision) ,FEATURE extraction ,COMPARATIVE method ,PYRAMIDS - Abstract
Convolutional neural networks (CNNs) and graph convolutional networks (GCNs) have made considerable advances in hyperspectral image (HSI) classification. However, most CNN-based methods learn features at a single scale in HSI data, which may be insufficient for multi-scale feature extraction in complex data scenes. To learn the relations among samples in non-grid data, GCNs are employed and combined with CNNs to process HSIs. Nevertheless, most methods based on CNN-GCN may overlook the integration of pixel-wise spectral signatures. In this paper, we propose a pyramid cascaded convolutional neural network with graph convolution (PCCGC) for hyperspectral image classification. It mainly comprises CNN-based and GCN-based subnetworks. Specifically, in the CNN-based subnetwork, a pyramid residual cascaded module and a pyramid convolution cascaded module are employed to extract multiscale spectral and spatial features separately, which can enhance the robustness of the proposed model. Furthermore, an adaptive feature-weighted fusion strategy is utilized to adaptively fuse multiscale spectral and spatial features. In the GCN-based subnetwork, a band selection network (BSNet) is used to learn the spectral signatures in the HSI using nonlinear inter-band dependencies. Then, the spectral-enhanced GCN module is utilized to extract and enhance the important features in the spectral matrix. Subsequently, a mutual-cooperative attention mechanism is constructed to align the spectral signatures between the BSNet-based matrix and the spectral-enhanced GCN-based matrix for spectral signature integration. Abundant experiments performed on four widely used real HSI datasets show that our model achieves higher classification accuracy than the fourteen other comparative methods, which shows the superior classification performance of PCCGC over the state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
33. AerialFormer: Multi-Resolution Transformer for Aerial Image Segmentation.
- Author
-
Hanyu, Taisei, Yamazaki, Kashu, Tran, Minh, McCann, Roy A., Liao, Haitao, Rainwater, Chase, Adkins, Meredith, Cothren, Jackson, and Le, Ngan
- Subjects
CONVOLUTIONAL neural networks ,TRANSFORMER models ,PROCESS capability ,IMAGE segmentation ,REMOTE sensing - Abstract
When performing remote sensing image segmentation, practitioners often encounter various challenges, such as a strong imbalance in the foreground–background, the presence of tiny objects, high object density, intra-class heterogeneity, and inter-class homogeneity. To overcome these challenges, this paper introduces AerialFormer, a hybrid model that strategically combines the strengths of Transformers and Convolutional Neural Networks (CNNs). AerialFormer features a CNN Stem module integrated to preserve low-level and high-resolution features, enhancing the model's capability to process details of aerial imagery. The proposed AerialFormer is designed with a hierarchical structure, in which a Transformer encoder generates multi-scale features and a multi-dilated CNN (MDC) decoder aggregates the information from the multi-scale inputs. As a result, information is taken into account in both local and global contexts, so that powerful representations and high-resolution segmentation can be achieved. The proposed AerialFormer was benchmarked on three benchmark datasets, including iSAID, LoveDA, and Potsdam. Comprehensive experiments and extensive ablation studies show that the proposed AerialFormer remarkably outperforms state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
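The multi-dilated CNN (MDC) decoder in the entry above trades depth for context: with stride-1 layers, each dilated convolution adds (k - 1) * d to the receptive field. A minimal sketch of that standard formula:

```python
def dilated_receptive_field(kernel_sizes, dilations):
    """Receptive field of a stack of stride-1 dilated convolutions.
    Each layer with kernel size k and dilation d adds (k - 1) * d."""
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        rf += (k - 1) * d
    return rf

# Three 3x3 layers with dilations 1, 2, 4 already cover 15 pixels.
print(dilated_receptive_field([3, 3, 3], [1, 2, 4]))  # 15
```

Doubling the dilation per layer grows the receptive field exponentially with depth, which is why a shallow multi-dilated decoder can still aggregate global context.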
34. MGCET: MLP-mixer and Graph Convolutional Enhanced Transformer for Hyperspectral Image Classification.
- Author
-
Al-qaness, Mohammed A. A., Wu, Guoyong, and AL-Alimi, Dalal
- Subjects
CONVOLUTIONAL neural networks ,TRANSFORMER models ,DATA mining ,IMAGE recognition (Computer vision) ,FEATURE extraction ,SPECTRAL imaging - Abstract
The vision transformer (ViT) has demonstrated performance comparable to that of convolutional neural networks (CNNs) in the hyperspectral image classification domain. This is achieved by transforming images into sequence data and mining global spectral-spatial information to establish remote dependencies. Nevertheless, both the ViT and CNNs have their own limitations. For instance, a CNN is constrained by the extent of its receptive field, which prevents it from fully exploiting global spatial-spectral features. Conversely, the ViT is prone to excessive distraction during the feature extraction process. To overcome the problem of insufficient feature information extraction caused by using a single paradigm, this paper proposes an MLP-mixer and a graph convolutional enhanced transformer (MGCET), whose network consists of a spatial-spectral extraction block (SSEB), an MLP-mixer, and a graph convolutional enhanced transformer (GCET). First, spatial-spectral features are extracted using SSEB, and then local spatial-spectral features are fused with global spatial-spectral features by the MLP-mixer. Finally, graph convolution is embedded in multi-head self-attention (MHSA) to mine spatial relationships and similarity between pixels, which further improves the modeling capability of the model. Correlation experiments were conducted on four different HSI datasets. The MGCET algorithm achieved overall accuracies (OAs) of 95.45%, 97.57%, 98.05%, and 98.52% on these datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
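The MLP-mixer component named in the MGCET abstract alternates an MLP applied across tokens (mixing spatial/spectral positions) with an MLP applied across channels, each wrapped in a residual connection. The sketch below is a generic mixer layer in numpy, not the paper's network; the ReLU activation and weight shapes are illustrative assumptions.

```python
import numpy as np

def mlp(x, w1, w2):
    # two-layer MLP with a ReLU nonlinearity
    return np.maximum(x @ w1, 0) @ w2

def mixer_layer(tokens, wt1, wt2, wc1, wc2):
    """One MLP-mixer layer on a (num_tokens, channels) matrix:
    token mixing operates across the sequence axis, channel mixing
    across the feature axis, each with a residual connection."""
    # token mixing: transpose so the MLP mixes information across tokens
    y = tokens + mlp(tokens.T, wt1, wt2).T
    # channel mixing: a standard per-token MLP over the feature axis
    return y + mlp(y, wc1, wc2)
```

With zero weights both MLPs vanish and the residual paths pass the tokens through unchanged, a quick sanity check on the wiring.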
35. Virtual Restoration of Ancient Mold-Damaged Painting Based on 3D Convolutional Neural Network for Hyperspectral Image.
- Author
-
Wang, Sa, Cen, Yi, Qu, Liang, Li, Guanghua, Chen, Yao, and Zhang, Lifu
- Subjects
CONVOLUTIONAL neural networks ,STANDARD deviations ,PRESERVATION of painting ,CULTURAL values ,CULTURAL property ,DIGITAL preservation - Abstract
Painted cultural relics hold significant historical value and are crucial in transmitting human culture. However, mold is a common issue for paper- or silk-based relics; it not only affects their preservation and longevity but also conceals texture, patterns, and color information, hindering the transmission of their cultural value. Current virtual restoration of painting relics primarily fills in RGB values based on neighborhood information, which can cause color distortion and other problems. Another approach treats mold as noise and employs maximum noise separation for its removal; however, eliminating the mold components and applying the inverse transformation often leads to further loss of information. To achieve effective virtual mold removal from ancient paintings, the spectral characteristics of mold were analyzed. Based on these spectral features and the restoration philosophy of maintaining originality, a 3D CNN artifact restoration network was proposed. This network learns features in the near-infrared (NIR) spectrum and spatial dimensions to reconstruct visible-spectrum reflectance, achieving virtual mold removal for calligraphic and art relics. Using an ancient painting from the Qing Dynasty as a test subject, the proposed method was compared with the Inpainting, Criminisi, and inverse MNF transformation methods across three regions. Visual analysis, quantitative evaluation (root mean squared error (RMSE), mean absolute percentage error (MAPE), and mean absolute error (MAE)), and a classification application were used to assess restoration accuracy. The visual results and quantitative analyses demonstrated that the proposed 3D CNN method effectively removes or mitigates mold while restoring the artwork to its authentic color in various backgrounds. Furthermore, the color classification results indicated that the images restored with the 3D CNN had the highest classification accuracy, with overall accuracies of 89.51%, 92.24%, and 93.63%, and Kappa coefficients of 0.88, 0.91, and 0.93, respectively. This research provides technological support for the digitalization and restoration of cultural artifacts, thereby contributing to the preservation and transmission of cultural heritage. [ABSTRACT FROM AUTHOR]
- Published
- 2024
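The basic operation a 3D CNN restoration network stacks, convolving jointly over the spectral and two spatial axes of a hyperspectral cube, can be sketched directly in numpy. This is a generic valid-mode 3D convolution for illustration, not the paper's trained network; the (bands, H, W) layout is an assumption.

```python
import numpy as np

def conv3d_valid(cube, kernel):
    """Valid-mode 3D convolution over a (bands, H, W) hyperspectral cube,
    the building block used to learn joint spectral-spatial features."""
    kb, kh, kw = kernel.shape
    B, H, W = cube.shape
    out = np.zeros((B - kb + 1, H - kh + 1, W - kw + 1))
    for b in range(out.shape[0]):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                out[b, i, j] = np.sum(cube[b:b + kb, i:i + kh, j:j + kw] * kernel)
    return out
```

Because the kernel spans several bands at once, each output voxel blends neighboring wavelengths with neighboring pixels, which is what lets NIR context inform the reconstruction of visible bands.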
36. TC–Radar: Transformer–CNN Hybrid Network for Millimeter-Wave Radar Object Detection.
- Author
-
Jia, Fengde, Li, Chenyang, Bi, Siyi, Qian, Junhui, Wei, Leizhe, and Sun, Guohao
- Subjects
CONVOLUTIONAL neural networks ,OBJECT recognition (Computer vision) ,TRANSFORMER models ,DATA integration ,NETWORK performance ,INTELLIGENT transportation systems - Abstract
In smart transportation, assisted driving relies on data integration from various sensors, notably LiDAR and cameras. However, their optical performance can degrade under adverse weather conditions, potentially compromising vehicle safety. Millimeter-wave radar, which can overcome these issues more economically, has been re-evaluated. Despite this, developing an accurate detection model is challenging due to significant noise interference and limited semantic information. To address these practical challenges, this paper presents the TC–Radar model, a novel approach that synergistically integrates the strengths of the Transformer and the convolutional neural network (CNN) to optimize the sensing potential of millimeter-wave radar in smart transportation systems. The rationale for this integration lies in the complementary nature of CNNs, which are adept at capturing local spatial features, and Transformers, which excel at modeling long-range dependencies and global context within data. This hybrid approach allows for a more robust and accurate representation of radar signals, leading to enhanced detection performance. A key innovation of our approach is the introduction of the Cross-Attention (CA) module, which facilitates efficient and dynamic information exchange between the encoder and decoder stages of the network. This CA mechanism ensures that critical features are accurately captured and transferred, thereby significantly improving the overall network performance. In addition, the model contains a dense information fusion block (DIFB) to further enrich the feature representation by integrating different high-frequency local features. This integration process ensures thorough incorporation of key data points.
Extensive tests conducted on the CRUW and CARRADA datasets validate the strengths of this method, with the model achieving an average precision (AP) of 83.99% and a mean intersection over union (mIoU) of 45.2%, demonstrating robust radar sensing capabilities. [ABSTRACT FROM AUTHOR]
- Published
- 2024
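The Cross-Attention module described in the TC–Radar abstract lets decoder-side queries attend to encoder-side features. A minimal single-head, scaled dot-product version in numpy (shared key/value matrix, no learned projections, which is a simplification of any real CA block):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values):
    """Scaled dot-product cross-attention: each query row attends over
    the encoder features and returns their softmax-weighted average."""
    d = queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(d)   # (nq, nk) similarities
    return softmax(scores) @ keys_values            # (nq, d) mixed features
```

With a single encoder feature the softmax weight is 1 and every query simply copies that feature, a useful degenerate case for checking the plumbing.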
37. Fire-Net: Rapid Recognition of Forest Fires in UAV Remote Sensing Imagery Using Embedded Devices.
- Author
-
Li, Shouliang, Han, Jiale, Chen, Fanghui, Min, Rudong, Yi, Sixue, and Yang, Zhen
- Subjects
FOREST fires ,CONVOLUTIONAL neural networks ,FOREST monitoring ,DRONE aircraft ,WILDFIRES - Abstract
Forest fires pose a catastrophic threat to Earth's ecology as well as to human life. Timely and accurate monitoring of forest fires can significantly reduce potential casualties and property damage. To address these problems, this paper proposed Fire-Net, a lightweight forest fire recognition model for unmanned aerial vehicle (UAV) imagery, which has a multi-stage structure and incorporates cross-channel attention following the fifth stage. This design enables the model to perceive features at various scales, particularly small-scale fire sources in wild forest scenes. Through training and testing on a real-world dataset, various lightweight convolutional neural networks were evaluated on embedded devices. The experimental outcomes indicate that Fire-Net attained an accuracy of 98.18%, a precision of 99.14%, and a recall of 98.01%, surpassing the current leading methods. Furthermore, the model achieves an average inference time of 10 milliseconds per image and operates at 86 frames per second (FPS) on embedded devices. [ABSTRACT FROM AUTHOR]
- Published
- 2024
38. Real-Time Wildfire Monitoring Using Low-Altitude Remote Sensing Imagery.
- Author
-
Tong, Hongwei, Yuan, Jianye, Zhang, Jingjing, Wang, Haofei, and Li, Teng
- Subjects
CONVOLUTIONAL neural networks ,TRANSFORMER models ,DRONE aircraft ,SUMMER ,REMOTE sensing ,FIRE detectors - Abstract
With rising global temperatures, wildfires frequently occur worldwide during the summer season. Timely detection of these fires from unmanned aerial vehicle (UAV) images can significantly reduce the damage they cause. Existing Convolutional Neural Network (CNN)-based fire detection methods usually stack multiple convolutional layers to enlarge the receptive field, but this compromises real-time performance. This paper proposes FireFormer, a novel real-time semantic segmentation network that combines the strengths of CNNs and Transformers to detect fires. A lightweight ResNet18 tailored for efficient fire segmentation is adopted as the encoder, and a Forest Fire Transformer Block (FFTB) rooted in the Transformer architecture is proposed as the decoder. Additionally, to accurately detect and segment small fire spots, we have developed a novel Feature Refinement Network (FRN) to enhance fire segmentation accuracy. The experimental results demonstrate that the proposed FireFormer achieves state-of-the-art performance on the publicly available forest fire dataset FLAME, specifically an impressive 73.13% IoU and 84.48% F1 score. [ABSTRACT FROM AUTHOR]
- Published
- 2024
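The IoU and F1 figures reported for FireFormer on FLAME are standard binary-segmentation metrics, computable directly from predicted and ground-truth masks. A minimal numpy sketch (the convention of scoring 1.0 for two empty masks is an assumption):

```python
import numpy as np

def segmentation_scores(pred, target):
    """IoU (Jaccard) and F1 (Dice) for binary segmentation masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()
    iou = inter / union if union else 1.0   # empty masks score perfectly
    f1 = 2 * inter / total if total else 1.0
    return iou, f1
```

Note that F1 is always at least IoU for the same masks (F1 = 2·IoU/(1+IoU)), which matches the 84.48% F1 vs. 73.13% IoU ordering in the abstract.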
39. MBT-UNet: Multi-Branch Transform Combined with UNet for Semantic Segmentation of Remote Sensing Images.
- Author
-
Liu, Bin, Li, Bing, Sreeram, Victor, and Li, Shuofeng
- Subjects
CONVOLUTIONAL neural networks ,TRANSFORMER models ,REMOTE sensing ,ENVIRONMENTAL monitoring ,RESOURCE management - Abstract
Remote sensing (RS) images play an indispensable role in many key fields such as environmental monitoring, precision agriculture, and urban resource management. Traditional deep convolutional neural networks have the problem of limited receptive fields. To address this problem, this paper introduces a hybrid network model that combines the advantages of CNN and Transformer, called MBT-UNet. First, a multi-branch encoder design based on the pyramid vision transformer (PVT) is proposed to effectively capture multi-scale feature information; second, an efficient feature fusion module (FFM) is proposed to optimize the collaboration and integration of features at different scales; finally, in the decoder stage, a multi-scale upsampling module (MSUM) is proposed to further refine the segmentation results and enhance segmentation accuracy. We conduct experiments on the ISPRS Vaihingen dataset, the Potsdam dataset, the LoveDA dataset, and the UAVid dataset. Experimental results show that MBT-UNet surpasses state-of-the-art algorithms in key performance indicators, confirming its superior performance in high-precision remote sensing image segmentation tasks. [ABSTRACT FROM AUTHOR]
- Published
- 2024
40. A CatBoost-Based Model for the Intensity Detection of Tropical Cyclones over the Western North Pacific Based on Satellite Cloud Images.
- Author
-
Zhong, Wei, Zhang, Deyuan, Sun, Yuan, and Wang, Qian
- Subjects
TROPICAL cyclones ,REMOTE-sensing images ,CONVOLUTIONAL neural networks ,STANDARD deviations ,BRIGHTNESS temperature - Abstract
A CatBoost-based intelligent tropical cyclone (TC) intensity-detecting model was built to quantify the intensity of TCs over the Western North Pacific (WNP) using the cloud-top brightness temperature (CTBT) data of Fengyun-2F (FY-2F) and Fengyun-2G (FY-2G) and the best-track data of the China Meteorological Administration (CMA-BST) in recent years (2015–2018). The CatBoost-based model features a greedy combination strategy, an ordering principle that mitigates possible gradient bias and prediction shift, and oblivious trees for fast scoring. Compared with previous studies based on pure convolutional neural network (CNN) models, the CatBoost-based model exhibited better skill in detecting TC intensity, with a root mean square error (RMSE) of 3.74 m s⁻¹. In addition to the three model features mentioned above, two further aspects of the model's design contributed to this result. On the one hand, the model introduced prior physical factors (e.g., the structure and shape of the cloud, deep convections, and background fields) into its training process. On the other hand, it expanded the dataset from 2342 to 13,471 samples through hourly interpolation of the original dataset. Furthermore, this paper investigated the errors of the model in detecting different categories of TC intensity. The results showed that the proposed TC intensity-detecting model has systematic biases, namely, the overestimation (underestimation) of intensities in TCs weaker (stronger) than typhoon level, with smaller (larger) errors for weaker (stronger) TCs. This implies that factors beyond the CTBT should be included to further reduce the errors in detecting strong TCs. [ABSTRACT FROM AUTHOR]
- Published
- 2023
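The core mechanism behind a CatBoost regressor, gradient boosting that repeatedly fits weak trees to the current residual, can be shown in miniature with decision stumps. This toy numpy version omits everything CatBoost-specific (ordered boosting, oblivious trees, categorical handling) and is only a sketch of the boosting loop itself.

```python
import numpy as np

def fit_stump(x, residual):
    """Best single-feature threshold split minimising squared error."""
    best = None
    for j in range(x.shape[1]):
        for t in np.unique(x[:, j]):
            left = x[:, j] <= t
            if left.all() or (~left).all():
                continue
            lv, rv = residual[left].mean(), residual[~left].mean()
            err = ((residual - np.where(left, lv, rv)) ** 2).sum()
            if best is None or err < best[0]:
                best = (err, j, t, lv, rv)
    return best[1:]

def boost(x, y, rounds=20, lr=0.5):
    """Gradient boosting on stumps: each round fits the current residual
    and a shrunken copy of the stump is added to the ensemble."""
    pred = np.full(len(y), y.mean())
    model = [y.mean()]
    for _ in range(rounds):
        j, t, lv, rv = fit_stump(x, y - pred)
        pred += lr * np.where(x[:, j] <= t, lv, rv)
        model.append((j, t, lv, rv))
    return model, pred
```

Each round shrinks the residual by the learning rate, so on separable toy data the ensemble converges geometrically to the targets.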
41. Spatiotemporal Prediction of Ionospheric Total Electron Content Based on ED-ConvLSTM.
- Author
-
Li, Liangchao, Liu, Haijun, Le, Huijun, Yuan, Jing, Shan, Weifeng, Han, Ying, Yuan, Guoming, Cui, Chunjie, and Wang, Junling
- Subjects
RECURRENT neural networks ,GLOBAL Positioning System ,CONVOLUTIONAL neural networks ,MAGNETIC storms ,DEEP learning ,PREDICTION models - Abstract
Total electron content (TEC) is a vital parameter for describing the state of the ionosphere, and precise prediction of TEC is of great significance for improving the accuracy of Global Navigation Satellite Systems (GNSS). At present, most deep learning prediction models consider only the temporal variation of TEC while ignoring the impact of spatial location. In this paper, we propose a TEC prediction model, ED-ConvLSTM, which combines convolutional neural networks with recurrent neural networks to capture spatiotemporal features simultaneously. Our ED-ConvLSTM model is built on an encoder-decoder architecture comprising two modules, an encoder and a decoder, each composed of ConvLSTM cells. The encoder extracts spatiotemporal features from TEC maps, while the decoder converts those features into predicted TEC maps. We compared the predictive performance of our model with two traditional time series models (LSTM and GRU), a spatiotemporal model (ConvGRU), and the TEC daily forecast product C1PG provided by CODE, on a total of 135 grid points in East Asia (10°–45°N, 90°–130°E). The experimental results show that the error indicators MAE, RMSE, and MAPE and the prediction similarity index SSIM of our model are superior to those of the comparison models in high, normal, and low solar activity years. The paper also analyzed the predictive performance of each model month by month. The results indicate that the predictive performance of each model is influenced by the monthly mean of TEC, with the proposed ED-ConvLSTM model being the least affected and the most stable. Additionally, the paper compared the predictive performance of each model during two magnetic storm periods, when TEC changes sharply. The results indicate that our ED-ConvLSTM model is the least affected during magnetic storms and its predictive performance is superior to that of the comparative models.
This paper provides a more stable and high-performance TEC spatiotemporal prediction model. [ABSTRACT FROM AUTHOR]
- Published
- 2023
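The error indicators used to rank the TEC models (MAE, RMSE, MAPE) are straightforward to compute over gridded predictions. A small numpy sketch, assuming the truth grid contains no zeros so MAPE is well defined:

```python
import numpy as np

def tec_metrics(pred, truth):
    """MAE, RMSE, and MAPE (in percent) over gridded TEC maps."""
    err = pred - truth
    mae = np.abs(err).mean()
    rmse = np.sqrt((err ** 2).mean())
    mape = np.abs(err / truth).mean() * 100.0   # truth assumed non-zero
    return mae, rmse, mape
```

MAE and RMSE share units with TEC itself (TECU), while MAPE normalizes by the truth, which is why the abstract's month-by-month analysis against the monthly mean of TEC is informative.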
42. Cross-Hole GPR for Soil Moisture Estimation Using Deep Learning.
- Author
-
Pongrac, Blaž, Gleich, Dušan, Malajner, Marko, and Sarjaš, Andrej
- Subjects
SOIL moisture ,DEEP learning ,SOIL moisture measurement ,TRANSMITTING antennas ,CONVOLUTIONAL neural networks ,ANTENNAS (Electronics) - Abstract
This paper presents the design of a high-voltage pulse-based radar and a supervised data processing method for soil moisture estimation. The goal of this research was to design a pulse-based radar to detect changes in soil moisture using a cross-hole approach. The pulse-based radar with three transmitting antennas was placed into a 12 m deep hole, and a receiver with three receive antennas was placed into a different hole separated by 100 m from the transmitter. The pulse generator was based on a Marx generator with an LC filter, and the receiver used a high-frequency data acquisition card that can acquire signals at 3 gigabytes per second. The borehole antennas used were designed to operate over a wide frequency band to ensure signal propagation through the soil. A deep regression convolutional network is proposed in this paper to estimate volumetric soil moisture from time-sampled signals. The regression convolutional network is extended to three dimensions to model changes in wave propagation between the transmitted and received signals. The training dataset was acquired over a period of 73 days between two boreholes separated by 100 m. The soil moisture measurements were acquired at three points 25 m apart to provide ground truth data. Additionally, water was poured into several specially prepared boreholes between the transmitter and receiver antennas to acquire an additional dataset for training, validation, and testing of the convolutional neural networks. Experimental results showed that the proposed system is able to detect changes in volumetric soil moisture using the Tx and Rx antennas. [ABSTRACT FROM AUTHOR]
- Published
- 2023
43. Convolutional Neural Network-Based Method for Agriculture Plot Segmentation in Remote Sensing Images.
- Author
-
Qi, Liang, Zuo, Danfeng, Wang, Yirong, Tao, Ye, Tang, Runkang, Shi, Jiayu, Gong, Jiajun, and Li, Bangyu
- Subjects
IMAGE segmentation ,REMOTE sensing ,REMOTE-sensing images ,LAND use ,FEATURE extraction ,AGRICULTURAL productivity - Abstract
Accurate delineation of individual agricultural plots, the foundational units for agriculture-based activities, is crucial for effective government oversight of agricultural productivity and land utilization. To improve the accuracy of plot segmentation in high-resolution remote sensing images, the paper collects GF-2 satellite remote sensing images, uses ArcGIS 10.3.1 software to establish datasets, and builds UNet, SegNet, DeeplabV3+, and TransUNet neural network frameworks, respectively, for experimental analysis. Then, the TransUNet network with the best segmentation performance is optimized in both the residual module and the skip connection to further improve its performance for plot segmentation in high-resolution remote sensing images. This article introduces Deformable ConvNets in the residual module to improve the original ResNet50 feature extraction network and integrates the convolutional block attention module (CBAM) at the skip connection. Experimental results indicate that the optimized remote sensing plot segmentation algorithm based on the TransUNet network achieves an Accuracy of 86.02%, a Recall of 83.32%, an F1-score of 84.67%, and an Intersection over Union (IoU) of 86.90%. Compared to the original TransUNet network for remote sensing land parcel segmentation, whose F1-score is 81.94% and whose IoU is 69.41%, the optimized TransUNet network has significantly improved the performance of remote sensing land parcel segmentation, which verifies the effectiveness and reliability of the plot segmentation algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2024
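The CBAM module added at the skip connections applies channel attention (from pooled channel descriptors through a shared MLP) followed by spatial attention. The numpy sketch below is a simplified stand-in: the spatial branch replaces CBAM's 7x7 convolution with a plain average of the pooled maps, and the MLP weights w1/w2 are assumed inputs.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cbam(x, w1, w2):
    """Simplified CBAM on a (C, H, W) feature map: channel attention from
    avg/max pooled descriptors through a shared MLP, then a spatial gate
    from channel-wise avg/max maps (7x7 conv replaced by a mean)."""
    # channel attention
    avg, mx = x.mean(axis=(1, 2)), x.max(axis=(1, 2))
    ca = sigmoid(np.maximum(avg @ w1, 0) @ w2 + np.maximum(mx @ w1, 0) @ w2)
    x = x * ca[:, None, None]
    # spatial attention
    sa = sigmoid((x.mean(axis=0) + x.max(axis=0)) / 2.0)
    return x * sa[None]
```

Both gates lie in (0, 1), so for non-negative activations CBAM only rescales features downward, suppressing irrelevant channels and locations rather than amplifying them.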
44. Target Detection Method for High-Frequency Surface Wave Radar RD Spectrum Based on (VI)CFAR-CNN and Dual-Detection Maps Fusion Compensation.
- Author
-
Ji, Yuanzheng, Liu, Aijun, Chen, Xuekun, Wang, Jiaqi, and Yu, Changjun
- Subjects
CONVOLUTIONAL neural networks ,TRACKING algorithms ,AUTOMATIC identification - Abstract
This paper proposes a method for the intelligent detection of high-frequency surface wave radar (HFSWR) targets. This method cascades the variability index (VI) adaptive constant false alarm rate (CFAR) detector with a convolutional neural network (CNN) to form a cascade detector, (VI)CFAR-CNN. First, the (VI)CFAR algorithm is used for the first-level detection of the range–Doppler (RD) spectrum; based on this result, two-dimensional window slices centered on each detected position in the RD spectrum are extracted and input into the CNN model for further target and clutter identification. When the detection rate of the detector reaches a level that cannot be further improved due to the convergence of the CNN model, this paper uses a dual-detection maps fusion method to compensate for the loss of detection performance: the dual-detection maps are first fused with optimized weights, and then the connected components in the fused detection map are further processed so that an independent (VI)CFAR compensates the (VI)CFAR-CNN detection results. Because HFSWR data with comprehensive and accurate target truth values are difficult to obtain, this paper constructs its RD spectrum dataset by embedding targets into measured backgrounds. The proposed method is compared with various other methods to demonstrate its superiority, and a small amount of automatic identification system (AIS) and radar correlation data are used to verify its effectiveness and feasibility on fully measured HFSWR data. [ABSTRACT FROM AUTHOR]
- Published
- 2024
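The first stage of the cascade is a CFAR detector. A minimal 1-D cell-averaging (CA) CFAR in numpy illustrates the principle; the paper's (VI)CFAR adaptively switches between CA variants based on a variability index, which this sketch does not implement, and the guard/train/scale values are illustrative.

```python
import numpy as np

def ca_cfar(power, guard=2, train=8, scale=3.0):
    """1-D cell-averaging CFAR: estimate the noise level from training
    cells on both sides of the cell under test (skipping guard cells)
    and flag cells whose power exceeds scale x that estimate."""
    n = len(power)
    hits = np.zeros(n, dtype=bool)
    half = guard + train
    for i in range(half, n - half):
        left = power[i - half:i - guard]
        right = power[i + guard + 1:i + half + 1]
        noise = np.concatenate([left, right]).mean()
        hits[i] = power[i] > scale * noise
    return hits
```

Because the threshold tracks the local noise estimate, the false-alarm rate stays roughly constant even when the clutter floor varies across the RD spectrum.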
45. TransHSI: A Hybrid CNN-Transformer Method for Disjoint Sample-Based Hyperspectral Image Classification.
- Author
-
Zhang, Ping, Yu, Haiyang, Li, Pengao, and Wang, Ruili
- Subjects
IMAGE recognition (Computer vision) ,CONVOLUTIONAL neural networks ,TRANSFORMER models ,CLASSIFICATION algorithms ,MULTISENSOR data fusion ,FEATURE extraction - Abstract
Research on hyperspectral image (HSI) classification has seen significant progress with the use of convolutional neural networks (CNNs) and Transformer blocks. However, previous studies primarily incorporated Transformer blocks at the end of their network architectures. Due to significant differences between the spectral and spatial features in HSIs, the extraction of both global and local spectral–spatial features remains incomplete. To address this challenge, this paper introduces a novel method called TransHSI. It incorporates a new spectral–spatial feature extraction module that fuses 3D CNNs with Transformer blocks to extract the local and global spectral features of HSIs, and then combines 2D CNNs and Transformer blocks to comprehensively capture the local and global spatial features of HSIs. Furthermore, a fusion module is proposed that not only integrates the learned shallow and deep features of HSIs but also applies a semantic tokenizer to transform the fused features, enhancing their discriminative power. This paper conducts experiments on three public datasets, Indian Pines, Pavia University, and Data Fusion Contest 2018, with training and test sets selected using a disjoint sampling strategy. A comparative analysis with 11 traditional and advanced HSI classification algorithms demonstrates that the proposed TransHSI achieves the highest overall accuracies and kappa coefficients, indicating competitive performance. [ABSTRACT FROM AUTHOR]
- Published
- 2023
46. A Comprehensive Survey on SAR ATR in Deep-Learning Era.
- Author
-
Li, Jianwei, Yu, Zhentao, Yu, Lu, Cheng, Pu, Chen, Jie, and Chi, Cheng
- Subjects
DEEP learning ,CONVOLUTIONAL neural networks ,SUPERVISED learning ,GENERATIVE adversarial networks ,AUTOMATIC target recognition ,DATA augmentation - Abstract
Due to the advantages of Synthetic Aperture Radar (SAR), the study of Automatic Target Recognition (ATR) has become a hot topic. Deep learning, especially the Convolutional Neural Network (CNN), works in an end-to-end way and has powerful feature-extraction abilities. Thus, researchers in SAR ATR also seek solutions from deep learning, and we review the related algorithms in this paper. We first introduce the commonly used datasets and the evaluation metrics. Then, we introduce the algorithms that predate deep learning: template-matching-, machine-learning- and model-based methods. After that, we introduce the SAR ATR methods of the deep-learning era (after 2017), which form the core of the paper. The non-CNN and CNN methods used in SAR ATR are summarized first; we found that researchers tend to design specialized CNNs for SAR ATR. Then, methods to address the problem of limited samples are reviewed: data augmentation, Generative Adversarial Networks (GANs), electromagnetic simulation, transfer learning, few-shot learning, semi-supervised learning, metric learning and domain knowledge. After that, the imbalance problem, real-time recognition, polarimetric SAR, complex data and adversarial attacks are also reviewed, along with their principles and open problems. Finally, future directions are discussed: datasets, CNN architecture design, knowledge-driven approaches, real-time recognition, explainability and adversarial attacks should be considered in future work. This paper gives readers a quick overview of the current state of the field. [ABSTRACT FROM AUTHOR]
- Published
- 2023
47. DMAU-Net: An Attention-Based Multiscale Max-Pooling Dense Network for the Semantic Segmentation in VHR Remote-Sensing Images.
- Author
-
Yang, Yang, Dong, Junwu, Wang, Yanhui, Yu, Bibo, and Yang, Zhigang
- Subjects
REMOTE-sensing images ,CONVOLUTIONAL neural networks ,REMOTE sensing ,IMAGE recognition (Computer vision) ,IMAGE segmentation ,FEATURE extraction - Abstract
High-resolution remote-sensing images cover more feature information, including texture, structure, shape, and other geometric details, while the relationships among target features are more complex. These factors make it more difficult for classical convolutional neural networks to obtain ideal results when performing feature classification on remote-sensing images. To address this issue, we proposed an attention-based multiscale max-pooling dense network (DMAU-Net), which is based on U-Net, for ground object classification. The network is designed with an integrated max-pooling module that incorporates dense connections in the encoder part to enhance the quality of the feature map, and thus improve the feature-extraction capability of the network. Likewise, in the decoder, we introduce the Efficient Channel Attention (ECA) module, which can strengthen the effective features and suppress irrelevant information. To validate the ground object classification performance of the multi-pooling integration network proposed in this paper, we conducted experiments on the Vaihingen and Potsdam datasets provided by the International Society for Photogrammetry and Remote Sensing (ISPRS) and compared DMAU-Net with other mainstream semantic segmentation models. The experimental results show that the proposed DMAU-Net effectively improves the accuracy of feature classification in high-resolution remote-sensing images. The feature boundaries obtained by DMAU-Net are clear and regionally complete, enhancing the ability to delineate the edges of features. [ABSTRACT FROM AUTHOR]
- Published
- 2023
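The ECA module used in DMAU-Net's decoder gates channels with weights produced by a small 1-D convolution across globally pooled channel descriptors, avoiding the dimensionality reduction of SE-style blocks. A numpy sketch with an assumed fixed kernel (real ECA learns the kernel and picks its size from the channel count):

```python
import numpy as np

def eca(x, kernel):
    """Efficient Channel Attention on a (C, H, W) map: global-average-pool
    to one value per channel, run a 1-D convolution across channels, and
    gate the feature map with the resulting sigmoid weights."""
    c = x.mean(axis=(1, 2))                       # squeeze: (C,)
    k = len(kernel)
    pad = k // 2
    cp = np.pad(c, pad, mode="edge")
    att = np.array([cp[i:i + k] @ kernel for i in range(len(c))])
    att = 1.0 / (1.0 + np.exp(-att))              # sigmoid gate per channel
    return x * att[:, None, None]
```

Because the 1-D convolution only looks at a few neighboring channels, the module captures local cross-channel interaction with a handful of parameters, which is what makes it "efficient".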
48. Acoustic Impedance Inversion from Seismic Imaging Profiles Using Self Attention U-Net.
- Author
-
Tao, Liurong, Ren, Haoran, and Gu, Zhiwei
- Subjects
ACOUSTIC impedance ,IMAGING systems in seismology ,CONVOLUTIONAL neural networks ,INVERSION (Geophysics) ,INVERSE problems ,DEEP learning ,NONLINEAR equations - Abstract
Seismic impedance inversion is a vital tool for geological interpretation and reservoir investigation from a geophysical perspective. However, it is inevitably an ill-posed problem due to noise and the band-limited character of seismic data. Artificial neural networks have been used to solve nonlinear inverse problems in recent years. This research obtained an acoustic impedance profile by feeding a seismic profile and background impedance into a well-trained self-attention U-Net. The U-Net converged after appropriate iteration, and its output predicted the impedance profiles in the test. To assess the quality of the predicted profiles from different perspectives (e.g., correlation, regression, and similarity), we used four kinds of indexes. For comparison, results were also computed with conventional methods (e.g., deconvolution with recursive inversion, and TV regularization) and a 1D neural network. The self-attention U-Net proved robust to noise and does not require prior knowledge, and its spatial continuity is better than that of the deconvolution, regularization, and 1D deep learning methods. The U-Net in this paper is a fully convolutional neural network, so there are no limits on the shape of the input; consequently, a large impedance profile can be predicted by a U-Net trained on a patchy training dataset. In addition, this paper applied the proposed method, without any labels, to field data from the Ceduna survey. The predictions prove that this well-trained network generalizes from synthetic data to field data. [ABSTRACT FROM AUTHOR]
- Published
- 2023
49. A Comparative Study of Different CNN Models and Transfer Learning Effect for Underwater Object Classification in Side-Scan Sonar Images.
- Author
-
Du, Xing, Sun, Yongfu, Song, Yupeng, Sun, Huifeng, and Yang, Lei
- Subjects
SONAR imaging ,DEEP learning ,CONVOLUTIONAL neural networks ,AUTOMATIC target recognition ,IMAGE recognition (Computer vision) ,SONAR - Abstract
With the development of deep learning techniques, convolutional neural networks (CNNs) are increasingly being used in image recognition for marine surveys and underwater object classification. Automatic recognition of targets in side-scan sonar (SSS) images using CNNs can improve recognition accuracy and efficiency. However, the vast selection of CNN models makes it challenging to choose models for target recognition in SSS images. Therefore, this paper comprehensively compares the prediction accuracy and computational performance of different CNN models. First, four traditional CNN models were applied to train and predict on the same submarine SSS dataset, using both the original models and models with transfer learning. Then, we examined the prediction accuracy and computational performance of the four CNN models. Results showed that transfer learning enhances the accuracy of all CNN models, with lesser improvements for AlexNet and VGG-16 and greater improvements for GoogleNet and ResNet101. GoogleNet achieved the highest prediction accuracy (100% on the training dataset and 94.27% on the test dataset) with favorable computational cost. The findings of this work are useful for future model selection in target recognition in SSS images. [ABSTRACT FROM AUTHOR]
- Published
- 2023
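Transfer learning as compared in this study typically means reusing a pretrained CNN as a feature extractor and training only a new classification head on the target (SSS) data. The numpy sketch below shows that regime in miniature, with the "frozen backbone" replaced by fixed feature vectors; the toy data and hyperparameters are illustrative, not from the paper.

```python
import numpy as np

def train_linear_probe(features, labels, classes, lr=0.1, epochs=200):
    """Train only a softmax classification head on frozen features,
    the lightweight end of the transfer-learning spectrum."""
    rng = np.random.default_rng(0)
    W = rng.normal(0.0, 0.01, (features.shape[1], classes))
    onehot = np.eye(classes)[labels]
    for _ in range(epochs):
        logits = features @ W
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)          # softmax probabilities
        # cross-entropy gradient w.r.t. W, averaged over the batch
        W -= lr * features.T @ (p - onehot) / len(labels)
    return W
```

Full fine-tuning would additionally update the backbone weights at a small learning rate; the head-only variant is faster and less prone to overfitting when the sonar dataset is small.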
50. Radar Emitter Recognition Based on Spiking Neural Networks.
- Author
-
Luo, Zhenghao, Wang, Xingdong, Yuan, Shuo, and Liu, Zhangmeng
- Subjects
ARTIFICIAL neural networks ,RADAR signal processing ,MILITARY electronics ,CONVOLUTIONAL neural networks ,ELECTRONIC measurements - Abstract
Efficient and effective radar emitter recognition is critical for electronic support measures (ESM) systems. However, in complex electromagnetic environments, intercepted pulse trains generally contain substantial data noise, including spurious and missing pulses. Currently, radar emitter recognition methods utilizing traditional artificial neural networks (ANNs) like CNNs and RNNs are susceptible to data noise and require intensive computation, posing challenges to meeting the performance demands of modern ESM systems. Spiking neural networks (SNNs) exhibit stronger representational capabilities than traditional ANNs due to the temporal dynamics of spiking neurons and the richer information encoded in precise spike timing. Furthermore, SNNs achieve higher computational efficiency by performing event-driven sparse addition calculations. In this paper, a lightweight spiking neural network is proposed by combining direct coding, leaky integrate-and-fire (LIF) neurons, and surrogate gradients to recognize radar emitters. Additionally, an improved SNN for radar emitter recognition is proposed, leveraging the local timing structure of pulses to enhance adaptability to data noise. Simulation results demonstrate the superior performance of the proposed method over existing methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
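The LIF neuron at the heart of such an SNN leaks its membrane potential toward rest, integrates input current, and emits a spike (then resets) when the potential crosses threshold. A discrete-time pure-Python sketch; the time constant, threshold, and hard reset are illustrative choices, and the surrogate gradient used for training is not shown.

```python
def lif_run(inputs, tau=2.0, v_th=1.0):
    """Discrete-time leaky integrate-and-fire neuron: the membrane
    potential decays toward the input, and a spike triggers a hard
    reset to zero. Returns the binary spike train."""
    v, spikes = 0.0, []
    for i in inputs:
        v = v + (i - v) / tau            # leaky integration step
        s = int(v >= v_th)               # threshold crossing -> spike
        spikes.append(s)
        v = 0.0 if s else v              # hard reset after a spike
    return spikes
```

Because the hard threshold is non-differentiable, training replaces its derivative with a smooth surrogate during backpropagation, which is the "surrogate gradients" ingredient named in the abstract.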