1,435 results
Search Results
2. A Method for Underwater Acoustic Target Recognition Based on the Delay-Doppler Joint Feature.
- Author
- Du, Libin, Wang, Zhengkai, Lv, Zhichao, Han, Dongyue, Wang, Lei, Yu, Fei, and Lan, Qing
- Subjects
- CONVOLUTIONAL neural networks, ARCHITECTURAL acoustics, OBJECT recognition (Computer vision), FOURIER transforms
- Abstract
To address the difficulty of identifying complex underwater acoustic targets with only a single Time–Frequency (TF) signal feature, this paper designs a method that recognizes underwater targets based on the Delay-Doppler joint feature. First, this method uses the symplectic finite Fourier transform (SFFT) to extract the Delay-Doppler features of underwater acoustic signals, analyzes the Time–Frequency features at the same time, and combines the Delay-Doppler (DD) feature and the Time–Frequency feature to form a joint feature (TF-DD). This paper uses three types of convolutional neural networks to verify that TF-DD can effectively improve the accuracy of target recognition. Secondly, this paper designs an object recognition model (TF-DD-CNN) that takes the joint feature as input, which simplifies the neural network's overall structure and improves the model's training efficiency. This research employs ship-radiated noise to validate the efficacy of TF-DD-CNN for target identification. The results demonstrate that the combined feature and the TF-DD-CNN model introduced in this study can proficiently detect ships, and the model notably enhances the precision of detection. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
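The SFFT mentioned in this abstract is, in one common convention (as used in OTFS-style signal processing), simply a pair of FFTs along the two axes of a Time-Frequency grid. A minimal sketch of that convention, not the authors' implementation:

```python
import numpy as np

def sfft(tf_grid):
    # Symplectic finite Fourier transform (one common convention):
    # FFT along the symbol (time) axis, inverse FFT along the
    # subcarrier (frequency) axis, mapping Time-Frequency -> Delay-Doppler.
    return np.fft.ifft(np.fft.fft(tf_grid, axis=1), axis=0)

def isfft(dd_grid):
    # Inverse SFFT: maps Delay-Doppler back to Time-Frequency.
    return np.fft.fft(np.fft.ifft(dd_grid, axis=1), axis=0)

# Round trip: the two transforms invert each other.
rng = np.random.default_rng(0)
X = rng.standard_normal((8, 16)) + 1j * rng.standard_normal((8, 16))
assert np.allclose(isfft(sfft(X)), X)
```

Sign and axis conventions for the SFFT vary between papers; the sketch only shows the structure of the transform.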
3. On-Board Multi-Class Geospatial Object Detection Based on Convolutional Neural Network for High Resolution Remote Sensing Images.
- Author
- Shen, Yanyun, Liu, Di, Chen, Junyi, Wang, Zhipan, Wang, Zhe, and Zhang, Qingling
- Subjects
- OBJECT recognition (Computer vision), CONVOLUTIONAL neural networks, REMOTE-sensing images, REMOTE sensing, DATA transmission systems, URBAN planning, OPTICAL remote sensing
- Abstract
Multi-class geospatial object detection in high-resolution remote sensing images has significant potential in various domains such as industrial production, military warning, disaster monitoring, and urban planning. However, the traditional process of remote sensing object detection involves several time-consuming steps, including image acquisition, image download, ground processing, and object detection. These steps may not be suitable for tasks with shorter timeliness requirements, such as military warning and disaster monitoring. Additionally, the transmission of massive data from satellites to the ground is limited by bandwidth, resulting in time delays and redundant information, such as cloud coverage images. To address these challenges and achieve efficient utilization of information, this paper proposes a comprehensive on-board multi-class geospatial object detection scheme. The proposed scheme consists of several steps. Firstly, the satellite imagery is sliced, and the PID-Net (Proportional-Integral-Derivative Network) method is employed to detect and filter out cloud-covered tiles. Subsequently, our Manhattan Intersection over Union (MIOU) loss-based YOLO (You Only Look Once) v7-Tiny method is used to detect remote-sensing objects in the remaining tiles. Finally, the detection results are mapped back to the original image, and the truncated NMS (Non-Maximum Suppression) method is utilized to filter out repeated and noisy boxes. To validate the reliability of the scheme, this paper creates a new dataset called DOTA-CD (Dataset for Object Detection in Aerial Images-Cloud Detection). Experiments were conducted on both ground and on-board equipment using the AIR-CD dataset, DOTA dataset, and DOTA-CD dataset. The results demonstrate the effectiveness of our method. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
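The truncated NMS used in this scheme is a variant of standard Non-Maximum Suppression. The baseline algorithm it builds on, greedily keeping the highest-scoring box and discarding overlapping ones, can be sketched as follows (a generic sketch, not the paper's truncated variant):

```python
import numpy as np

def iou(box, boxes):
    # Intersection-over-Union of one [x1, y1, x2, y2] box against many.
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = lambda b: (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    return inter / (area(box) + area(boxes) - inter)

def nms(boxes, scores, thresh=0.5):
    # Keep the best-scoring box, drop boxes overlapping it, repeat.
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        order = rest[iou(boxes[i], boxes[rest]) <= thresh]
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 10, 10], [20, 20, 30, 30.]])
scores = np.array([0.9, 0.8, 0.7])
print(nms(boxes, scores))  # → [0, 2]: the two overlapping boxes collapse to one
```

Mapping tile-level detections back to the full image produces duplicate boxes along tile borders, which is exactly the situation a suppression pass like this cleans up.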
4. Remote Sensing for Maritime Monitoring and Vessel Identification.
- Author
- Salerno, Emanuele, Di Paola, Claudio, and Lo Duca, Angelica
- Subjects
- DEEP learning, REMOTE sensing, CONVOLUTIONAL neural networks, SURVEILLANCE radar, SYNTHETIC aperture radar, INFORMATION technology, PATTERN recognition systems
- Abstract
This document explores the significance of remote sensing in monitoring maritime activities and identifying vessels. It emphasizes the need for surveillance to ensure safety, security, and emergency management, given the increasing number of vessels worldwide. The document highlights the use of technologies like the Automatic Identification System (AIS) and remote sensing in situations where collaborative systems are not reliable. It also discusses the integration of data from different sensors and the application of data science techniques for a comprehensive assessment of maritime traffic. The document concludes by summarizing research papers on ship detection, tracking, and classification using various sensors and data processing techniques. [Extracted from the article]
- Published
- 2024
- Full Text
- View/download PDF
5. Deep-Learning-Based Daytime COT Retrieval and Prediction Method Using FY4A AGRI Data.
- Author
- Xu, Fanming, Song, Biao, Chen, Jianhua, Guan, Runda, Zhu, Rongjie, Liu, Jiayu, and Qiu, Zhongfeng
- Subjects
- CONVOLUTIONAL neural networks, PREDICTION models, DEEP learning, FORECASTING
- Abstract
The traditional method for retrieving cloud optical thickness (COT) is carried out through a Look-Up Table (LUT). In this scenario, researchers must make a series of idealized assumptions and conduct extensive observation and feature recording, consuming considerable resources. The emergence of deep learning effectively addresses the shortcomings of the traditional approach. In this paper, we first propose a daytime (solar zenith angle, SOZA < 70°) COT retrieval algorithm based on FY-4A AGRI. We establish and train a Convolutional Neural Network (CNN) model for COT retrieval, CM4CR, with CALIPSO's COT product, spatially and temporally synchronized, as the ground truth. Then, a deep learning method extended from video prediction models is adopted to predict COT values based on the retrieval results obtained from CM4CR. The COT prediction model (CPM) consists of an encoder, a predictor, and a decoder. On this basis, we further incorporated a time embedding module to enhance the model's ability to learn from irregular time intervals in the input COT sequence. During the training phase, we employed Charbonnier Loss and Edge Loss to enhance the model's capability to represent COT details. Experiments indicate that our CM4CR outperforms existing COT retrieval methods, with predictions showing better performance across several metrics than other benchmark prediction models. Additionally, this paper also investigates the impact of different lengths of COT input sequences and of the time intervals between adjacent frames of COT on prediction performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
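Of the two training losses named in this abstract, the Charbonnier loss has a one-line definition: a smoothed L1 penalty, sqrt(d² + eps²), which keeps L1's robustness to outliers while staying differentiable at zero. A minimal sketch (the eps value is illustrative):

```python
import numpy as np

def charbonnier(pred, target, eps=1e-3):
    # Charbonnier loss: sqrt(diff^2 + eps^2), a smooth variant of L1.
    return float(np.mean(np.sqrt((pred - target) ** 2 + eps ** 2)))

x = np.zeros(10)
# At zero error the loss floors at eps rather than at exactly zero.
assert abs(charbonnier(x, x) - 1e-3) < 1e-12
# For large errors it grows like |d| (here ~5), not like d^2 (which would be 25).
assert abs(charbonnier(x + 5.0, x) - 5.0) < 1e-3
```

The Edge Loss mentioned alongside it is typically a similar penalty applied to edge-filtered (gradient) images, encouraging sharp detail; its exact form in the paper is not specified here.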
6. Transfer Learning-Based Specific Emitter Identification for ADS-B over Satellite System.
- Author
- Liu, Mingqian, Chai, Yae, Li, Ming, Wang, Jiakun, and Zhao, Nan
- Subjects
- CONVOLUTIONAL neural networks, LOW earth orbit satellites, AUTOMATIC dependent surveillance-broadcast, HUMAN fingerprints, FEATURE extraction, DISTRIBUTED sensors
- Abstract
In future aviation surveillance, the demand for higher real-time updates for global flights can be met by deploying automatic dependent surveillance–broadcast (ADS-B) receivers on low Earth orbit satellites, capitalizing on their global coverage and terrain-independent capabilities for seamless monitoring. Specific emitter identification (SEI) leverages the distinctive features of ADS-B data. High data collection and annotation costs, along with limited dataset size, can lead to overfitting during training and low model recognition accuracy. Transfer learning, which does not require source and target domain data to share the same distribution, significantly reduces the sensitivity of traditional models to data volume and distribution. It can also address issues related to the incompleteness and inadequacy of communication emitter datasets. This paper proposes a distributed sensor system based on transfer learning to address the specific emitter identification task. Firstly, signal fingerprint features are extracted using a bispectrum transform (BST) to preliminarily train a convolutional neural network (CNN). Decision fusion is employed to tackle the challenges of the distributed system. Subsequently, a transfer learning strategy is employed, incorporating frozen model parameters, maximum mean discrepancy (MMD), and classification error measures to reduce the disparity between the target and source domains. A hyperbolic space module is introduced before the output layer to enhance the expressive capacity and data information extraction. After iterative training, the transfer learning model is obtained. Simulation results confirm that this method enhances model generalization, addresses the issue of slow convergence, and leads to improved training accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
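The maximum mean discrepancy used here to align source and target domains has a simple kernel-based estimate: compare within-domain kernel similarities against cross-domain ones. A sketch with a Gaussian kernel (the kernel choice and bandwidth are assumptions, not taken from the paper):

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    # Pairwise Gaussian (RBF) kernel matrix between sample sets a and b.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def mmd2(x, y, sigma=1.0):
    # Biased estimate of squared MMD: E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)].
    return (gaussian_kernel(x, x, sigma).mean()
            + gaussian_kernel(y, y, sigma).mean()
            - 2 * gaussian_kernel(x, y, sigma).mean())

rng = np.random.default_rng(1)
same = rng.standard_normal((200, 2))
shifted = rng.standard_normal((200, 2)) + 3.0
assert mmd2(same, same) < 1e-9            # identical samples: zero discrepancy
assert mmd2(same, shifted) > mmd2(same, same)  # shifted domain: positive discrepancy
```

In transfer-learning setups like the one described, a term of this form is added to the classification loss so that source- and target-domain feature distributions are pulled together during training.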
7. Combining "Deep Learning" and Physically Constrained Neural Networks to Derive Complex Glaciological Change Processes from Modern High-Resolution Satellite Imagery: Application of the GEOCLASS-Image System to Create VarioCNN for Glacier Surges.
- Author
- Herzfeld, Ute C., Hessburg, Lawrence J., Trantow, Thomas M., and Hayes, Adam N.
- Subjects
- REMOTE-sensing images, CONVOLUTIONAL neural networks, DEEP learning, GLACIERS, IMAGE recognition (Computer vision), ACCELERATION (Mechanics)
- Abstract
The objectives of this paper are to investigate the trade-offs between a physically constrained neural network and a deep, convolutional neural network and to design a combined ML approach ("VarioCNN"). Our solution is provided in the framework of a cyberinfrastructure that includes a newly designed ML software, GEOCLASS-image (v1.0), modern high-resolution satellite image data sets (Maxar WorldView data), and instructions/descriptions that may facilitate solving similar spatial classification problems. Combining the advantages of the physically-driven connectionist-geostatistical classification method with those of an efficient CNN, VarioCNN provides a means for rapid and efficient extraction of complex geophysical information from submeter resolution satellite imagery. A retraining loop overcomes the difficulties of creating a labeled training data set. Computational analyses and developments are centered on a specific, but generalizable, geophysical problem: The classification of crevasse types that form during the surge of a glacier system. A surge is a glacial catastrophe, an acceleration of a glacier to typically 100–200 times its normal velocity. GEOCLASS-image is applied to study the current (2016-2024) surge in the Negribreen Glacier System, Svalbard. The geophysical result is a description of the structural evolution and expansion of the surge, based on crevasse types that capture ice deformation in six simplified classes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
8. Vulnerable Road User Skeletal Pose Estimation Using mmWave Radars.
- Author
- Zeng, Zhiyuan, Liang, Xingdong, Li, Yanlei, and Dang, Xiangwei
- Subjects
- ROAD users, TRACKING radar, RADAR targets, CONVOLUTIONAL neural networks, RADAR signal processing, DATA augmentation
- Abstract
A skeletal pose estimation method, named RVRU-Pose, is proposed to estimate the skeletal pose of vulnerable road users based on distributed non-coherent mmWave radar. In view of the limitation that existing methods for skeletal pose estimation are only applicable to small scenes, this paper proposes a strategy that combines radar intensity heatmaps and coordinate heatmaps as input to a deep learning network. In addition, we design a multi-resolution data augmentation and training method suitable for radar to achieve target pose estimation for remote and multi-target application scenarios. Experimental results show that RVRU-Pose can achieve better than 2 cm average localization accuracy for different subjects in different scenarios, which is superior in terms of accuracy and time compared to existing state-of-the-art methods for human skeletal pose estimation with radar. As an essential performance parameter of radar, the impact of angular resolution on the estimation accuracy of a skeletal pose is quantitatively analyzed and evaluated in this paper. Finally, RVRU-Pose has also been extended to the task of estimating the skeletal pose of a cyclist, reflecting the strong scalability of the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
9. Locating and Grading of Lidar-Observed Aircraft Wake Vortex Based on Convolutional Neural Networks.
- Author
- Zhang, Xinyu, Zhang, Hongwei, Wang, Qichao, Liu, Xiaoying, Liu, Shouxin, Zhang, Rongchuan, Li, Rongzhong, and Wu, Songhua
- Subjects
- CONVOLUTIONAL neural networks, DOPPLER lidar, AERONAUTICAL safety measures
- Abstract
Aircraft wake vortices are serious threats to aviation safety. The Pulsed Coherent Doppler Lidar (PCDL) has been widely used in the observation of aircraft wake vortices due to its advantages of high spatial-temporal resolution and high precision. However, the post-processing algorithms require significant computing resources, which cannot achieve the real-time detection of a wake vortex (WV). This paper presents an improved Convolutional Neural Network (CNN) method for WV locating and grading based on PCDL data to avoid the influence of unstable ambient wind fields on the localization and classification results of WV. Typical WV cases are selected for analysis, and the WV locating and grading models are validated on different test sets. The consistency of the analytical algorithm and the CNN algorithm is verified. The results indicate that the improved CNN method achieves satisfactory recognition accuracy with higher efficiency and better robustness, especially in the case of strong turbulence, where the CNN method recognizes the wake vortex while the analytical method cannot. The improved CNN method is expected to be applied to optimize the current aircraft spacing criteria, which is promising in terms of aviation safety and economic benefit improvement. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
10. Joint Classification of Hyperspectral and LiDAR Data Based on Adaptive Gating Mechanism and Learnable Transformer.
- Author
- Wang, Minhui, Sun, Yaxiu, Xiang, Jianhong, Sun, Rui, and Zhong, Yu
- Subjects
- TRANSFORMER models, CONVOLUTIONAL neural networks, LIDAR, DIGITAL elevation models, TRANSFER matrix, DATA fusion (Statistics)
- Abstract
Utilizing multi-modal data, as opposed to only hyperspectral image (HSI), enhances target identification accuracy in remote sensing. Transformers are applied to multi-modal data classification for their long-range dependency but often overlook intrinsic image structure by directly flattening image blocks into vectors. Moreover, as the encoder deepens, unprofitable information negatively impacts classification performance. Therefore, this paper proposes a learnable transformer with an adaptive gating mechanism (AGMLT). Firstly, a spectral–spatial adaptive gating mechanism (SSAGM) is designed to comprehensively extract the local information from images. It mainly contains point depthwise attention (PDWA) and asymmetric depthwise attention (ADWA). The former is for extracting spectral information of HSI, and the latter is for extracting spatial information of HSI and elevation information of LiDAR-derived rasterized digital surface models (LiDAR-DSM). By omitting linear layers, local continuity is maintained. Then, the layer Scale and learnable transition matrix are introduced to the original transformer encoder and self-attention to form the learnable transformer (L-Former). It improves data dynamics and prevents performance degradation as the encoder deepens. Subsequently, learnable cross-attention (LC-Attention) with the learnable transfer matrix is designed to augment the fusion of multi-modal data by enriching feature information. Finally, poly loss, known for its adaptability with multi-modal data, is employed in training the model. Experiments in the paper are conducted on four famous multi-modal datasets: Trento (TR), MUUFL (MU), Augsburg (AU), and Houston2013 (HU). The results show that AGMLT achieves optimal performance over some existing models. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
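The poly loss used to train AGMLT most likely refers to the PolyLoss family (Leng et al.), whose simplest member, Poly-1, adds a term epsilon * (1 - p_t) to cross-entropy, where p_t is the predicted probability of the true class. A sketch under that assumption (the paper's exact variant and epsilon are not specified here):

```python
import numpy as np

def poly1_ce(logits, label, epsilon=1.0):
    # Poly-1 loss: cross-entropy plus epsilon * (1 - p_t),
    # where p_t is the softmax probability of the true class.
    p = np.exp(logits - logits.max())   # numerically stable softmax
    p = p / p.sum()
    pt = p[label]
    return -np.log(pt) + epsilon * (1.0 - pt)

confident = np.array([10.0, 0.0, 0.0])
assert poly1_ce(confident, 0) < 0.01  # confident, correct: near-zero loss
assert poly1_ce(confident, 1) > 5.0   # confident, wrong: large loss
```

The extra (1 - p_t) term lets the loss be tuned per task, which is the adaptability the abstract credits it with for multi-modal data.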
11. CroplandCDNet: Cropland Change Detection Network for Multitemporal Remote Sensing Images Based on Multilayer Feature Transmission Fusion of an Adaptive Receptive Field.
- Author
- Wu, Qiang, Huang, Liang, Tang, Bo-Hui, Cheng, Jiapei, Wang, Meiqi, and Zhang, Zixuan
- Subjects
- CONVOLUTIONAL neural networks, CHANGE-point problems, FARMS, MARKOV random fields, REMOTE-sensing images, FEATURE extraction
- Abstract
Dynamic monitoring of cropland using high spatial resolution remote sensing images is a powerful means to protect cropland resources. However, when a change detection method based on a convolutional neural network employs a large number of convolution and pooling operations to mine the deep features of cropland, the accumulation of irrelevant features and the loss of key features will lead to poor detection results. To effectively solve this problem, a novel cropland change detection network (CroplandCDNet) is proposed in this paper; this network combines an adaptive receptive field and multiscale feature transmission fusion to achieve accurate detection of cropland change information. CroplandCDNet first effectively extracts the multiscale features of cropland from bitemporal remote sensing images through the feature extraction module and subsequently embeds the receptive field adaptive SK attention (SKA) module to emphasize cropland change. Moreover, the SKA module effectively uses spatial context information for the dynamic adjustment of the convolution kernel size of cropland features at different scales. Finally, multiscale features and difference features are transmitted and fused layer by layer to obtain the content of cropland change. In the experiments, the proposed method is compared with six advanced change detection methods using the cropland change detection dataset (CLCD). The experimental results show that CroplandCDNet achieves the best F1 and OA at 76.04% and 94.47%, respectively. Its precision and recall are second best of all models at 76.46% and 75.63%, respectively. Moreover, a generalization experiment was carried out using the Jilin-1 dataset, which effectively verified the reliability of CroplandCDNet in cropland change detection. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
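The F1, OA, precision, and recall figures quoted in this abstract follow from standard pixel-level confusion counts on the "changed" class. For reference, the definitions (with illustrative counts, not the CLCD results):

```python
def change_detection_metrics(tp, fp, fn, tn):
    # Pixel-level metrics commonly reported for binary change detection:
    # precision, recall, F1 on the 'changed' class, and overall accuracy.
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    oa = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f1, oa

p, r, f1, oa = change_detection_metrics(tp=80, fp=20, fn=20, tn=880)
print(round(f1, 2), round(oa, 2))  # → 0.8 0.96
```

Note that OA can look high even when F1 is modest, because unchanged pixels (tn) usually dominate a change-detection scene.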
12. Hardware-Aware Design of Speed-Up Algorithms for Synthetic Aperture Radar Ship Target Detection Networks.
- Author
- Zhang, Yue, Jiang, Shuai, Cao, Yue, Xiao, Jiarong, Li, Chengkun, Zhou, Xuan, and Yu, Zhongjun
- Subjects
- SYNTHETIC aperture radar, RADAR targets, SYNTHETIC apertures, CONVOLUTIONAL neural networks, SUCCESSIVE approximation analog-to-digital converters, NAVAL architecture, ALGORITHMS
- Abstract
Recently, synthetic aperture radar (SAR) target detection algorithms based on Convolutional Neural Networks (CNN) have received increasing attention. However, the large amount of computation required burdens the real-time detection of SAR ship targets on resource-limited and power-constrained satellite-based platforms. In this paper, we propose a hardware-aware model speed-up method for single-stage SAR ship target detection tasks, oriented towards the most widely used hardware for neural network computing, the Graphics Processing Unit (GPU). We first analyze the process by which the task of detection is executed on GPUs and propose two strategies according to this process. Firstly, in order to speed up the execution of the model on a GPU, we propose SAR-aware model quantification to allow the original model to be stored and computed in a low-precision format. Next, to ensure the loss of accuracy is negligible after the acceleration and compression process, precision-aware scheduling is used to filter out layers that are not suitable for quantification and to store and execute them in a high-precision mode. Trained on the dataset HRSID, the effectiveness of this model speed-up algorithm was demonstrated by compressing four different sizes of models (yolov5n, yolov5s, yolov5m, yolov5l). The experimental results show that the detection speeds of yolov5n, yolov5s, yolov5m, and yolov5l can reach 234.7785 fps, 212.8341 fps, 165.6523 fps, and 139.8758 fps on the NVIDIA AGX Xavier development board with negligible loss of accuracy, which is 1.230 times, 1.469 times, 1.955 times, and 2.448 times faster than before the use of this method, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
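The low-precision storage and compute described in this abstract is, in its generic form, post-training quantization. A per-tensor symmetric int8 sketch of the basic mechanism (not the paper's SAR-aware scheme, which additionally selects which layers to quantize):

```python
import numpy as np

def quantize_int8(w):
    # Symmetric per-tensor int8 quantization: one float scale factor,
    # weights stored as 8-bit integers in [-127, 127].
    scale = float(np.abs(w).max()) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover approximate float weights from the int8 tensor.
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(1000).astype(np.float32)
q, s = quantize_int8(w)
err = np.abs(dequantize(q, s) - w).max()
assert err <= s / 2 + 1e-6  # worst-case rounding error is half a quantization step
```

Storing int8 instead of float32 cuts weight memory 4x; the speedups reported in the abstract come from the GPU additionally executing the quantized layers with low-precision arithmetic.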
13. Remote Sensing Crop Recognition by Coupling Phenological Features and Off-Center Bayesian Deep Learning.
- Author
- Wu, Yongchuang, Wu, Penghai, Wu, Yanlan, Yang, Hui, and Wang, Biao
- Subjects
- REMOTE sensing, DEEP learning, RECURRENT neural networks, CONVOLUTIONAL neural networks, AREA measurement
- Abstract
Obtaining accurate and timely crop area information is crucial for crop yield estimates and food security. Because most existing crop mapping models based on remote sensing data have poor generalizability, they cannot be rapidly deployed for crop identification tasks in different regions. Based on a priori knowledge of phenology, we designed an off-center Bayesian deep learning remote sensing crop classification method that can highlight phenological features, combined with an attention mechanism and residual connectivity. In this paper, we first optimize the input image and input features based on a phenology analysis. Then, a convolutional neural network (CNN), recurrent neural network (RNN), and random forest classifier (RFC) were built based on farm data in northeastern Inner Mongolia and applied to perform comparisons with the method proposed here. Then, classification tests were performed on soybean, maize, and rice from four measurement areas in northeastern China to verify the accuracy of the above methods. To further explore the reliability of the method proposed in this paper, an uncertainty analysis was conducted by Bayesian deep learning to analyze the model's learning process and model structure for interpretability. Finally, statistical data collected in Suibin County, Heilongjiang Province, over many years, and Shandong Province in 2020 were used as reference data to verify the applicability of the methods. The experimental results show that the classification accuracy of the three crops reached 90.73% overall and the average F1 and IOU were 89.57% and 81.48%, respectively. Furthermore, the proposed method can be directly applied to crop area estimations in different years in other regions based on its good correlation with official statistics. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
14. Deep Learning for Earthquake Disaster Assessment: Objects, Data, Models, Stages, Challenges, and Opportunities.
- Author
- Jia, Jing and Ye, Wenjie
- Subjects
- DEEP learning, CONVOLUTIONAL neural networks, EARTHQUAKES, GENERATIVE adversarial networks, RECURRENT neural networks, IMAGE recognition (Computer vision)
- Abstract
Earthquake Disaster Assessment (EDA) plays a critical role in earthquake disaster prevention, evacuation, and rescue efforts. Deep learning (DL), which boasts advantages in image processing, signal recognition, and object detection, has facilitated scientific research in EDA. This paper analyses 204 articles through a systematic literature review to investigate the status quo, development, and challenges of DL for EDA. The paper first examines the distribution characteristics and trends of the two categories of EDA assessment objects, including earthquakes and secondary disasters as disaster objects, buildings, infrastructure, and areas as physical objects. Next, this study analyses the application distribution, advantages, and disadvantages of the three types of data (remote sensing data, seismic data, and social media data) mainly involved in these studies. Furthermore, the review identifies the characteristics and application of six commonly used DL models in EDA, including convolutional neural network (CNN), multi-layer perceptron (MLP), recurrent neural network (RNN), generative adversarial network (GAN), transfer learning (TL), and hybrid models. The paper also systematically details the application of DL for EDA at different times (i.e., pre-earthquake stage, during-earthquake stage, post-earthquake stage, and multi-stage). We find that the most extensive research in this field involves using CNNs for image classification to detect and assess building damage resulting from earthquakes. Finally, the paper discusses challenges related to training data and DL models, and identifies opportunities in new data sources, multimodal DL, and new concepts. This review provides valuable references for scholars and practitioners in related fields. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
15. Hyperspectral Image Classification via Spatial Shuffle-Based Convolutional Neural Network.
- Author
- Wang, Zhihui, Cao, Baisong, and Liu, Jun
- Subjects
- CONVOLUTIONAL neural networks, IMAGE recognition (Computer vision), SPECTRAL imaging
- Abstract
The unique spatial–spectral integration characteristics of hyperspectral imagery (HSI) make it widely applicable in many fields. The spatial–spectral feature fusion-based HSI classification has always been a research hotspot. Typically, classification methods based on spatial–spectral features will select larger neighborhood windows to extract more spatial features for classification. However, this approach can also lead to the problem of non-independent training and testing sets to a certain extent. This paper proposes a spatial shuffle strategy that selects a smaller neighborhood window and randomly shuffles the pixels within the window. This strategy simulates the potential patterns of the pixel distribution in the real world as much as possible. Then, the samples of a three-dimensional HSI cube are transformed into two-dimensional images. Training with a simple CNN model that is not optimized for architecture can still achieve very high classification accuracy, indicating that the proposed method of this paper has considerable performance-improvement potential. The experimental results also indicate that smaller neighborhood windows can achieve the same, or even better, classification performance compared to larger neighborhood windows. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
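The spatial shuffle strategy described in this abstract permutes pixel positions inside a small neighborhood window while leaving each pixel's full spectrum intact. A sketch of that idea (window size and data are illustrative, not the authors' code):

```python
import numpy as np

def spatial_shuffle(cube, rng):
    # Randomly permute pixel positions inside a (win x win x bands) HSI
    # neighborhood cube, keeping each pixel's spectrum intact.
    h, w, bands = cube.shape
    flat = cube.reshape(h * w, bands)
    return flat[rng.permutation(h * w)].reshape(h, w, bands)

rng = np.random.default_rng(0)
cube = np.arange(3 * 3 * 4).reshape(3, 3, 4)  # 3x3 window, 4 bands
out = spatial_shuffle(cube, rng)
# Same multiset of spectra, different spatial arrangement:
assert sorted(map(tuple, out.reshape(9, 4))) == sorted(map(tuple, cube.reshape(9, 4)))
```

Because only positions change, the augmentation destroys accidental spatial correlations between neighboring training and testing pixels, which is the non-independence problem the abstract raises.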
16. A CatBoost-Based Model for the Intensity Detection of Tropical Cyclones over the Western North Pacific Based on Satellite Cloud Images.
- Author
- Zhong, Wei, Zhang, Deyuan, Sun, Yuan, and Wang, Qian
- Subjects
- TROPICAL cyclones, REMOTE-sensing images, CONVOLUTIONAL neural networks, STANDARD deviations, BRIGHTNESS temperature
- Abstract
A CatBoost-based intelligent tropical cyclone (TC) intensity-detecting model was built to quantify the intensity of TCs over the Western North Pacific (WNP) with the cloud-top brightness temperature (CTBT) data of Fengyun-2F (FY-2F) and Fengyun-2G (FY-2G) and the best-track data of the China Meteorological Administration (CMA-BST) in recent years (2015–2018). The CatBoost-based model was featured with the greedy strategy of combination, the ordering principle in optimizing the possible gradient bias and prediction shift problems, and the oblivious tree in fast scoring. Compared with the previous studies based on the pure convolutional neural network (CNN) models, the CatBoost-based model exhibited better skills in detecting the TC intensity with the root mean square error (RMSE) of 3.74 m s⁻¹. In addition to the three mentioned model features, there are also two reasons for the model's design. On one hand, the CatBoost-based model used the method of introducing prior physical factors (e.g., the structure and shape of the cloud, deep convections, and background fields) into its training process. On the other hand, the CatBoost-based model expanded the dataset size from 2342 to 13,471 samples through hourly interpolations of the original dataset. Furthermore, this paper investigated the errors of this model in detecting the different categories of TC intensity. The results showed that the deep learning-based TC intensity-detecting model proposed in this paper has systematic biases, namely, the overestimation (underestimation) of intensities in TCs which were weaker (stronger) than at the typhoon level, and the errors of the model in detecting weaker (stronger) TCs were smaller (larger). This implies that more factors than the CTBT should be included to further reduce the errors in detecting strong TCs. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
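The dataset expansion from 2342 to 13,471 samples via hourly interpolation can be illustrated with linear interpolation of a 6-hourly intensity series, since best-track records are typically reported at 6-hour intervals (the values below are illustrative, not CMA-BST data):

```python
import numpy as np

# A toy 6-hourly best-track intensity series (hours, m/s).
t6 = np.array([0, 6, 12, 18])
v6 = np.array([20.0, 26.0, 35.0, 32.0])

# Resample to an hourly grid: roughly a sixfold increase in samples,
# matching the 2342 -> 13,471 expansion reported in the abstract.
t1 = np.arange(0, 19)
v1 = np.interp(t1, t6, v6)

assert len(v1) == 19
assert v1[3] == 23.0  # halfway between the 20 and 26 m/s fixes
```

The interpolated labels are then paired with the hourly CTBT imagery, which is what makes the extra training samples usable.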
17. Spatiotemporal Prediction of Ionospheric Total Electron Content Based on ED-ConvLSTM.
- Author
- Li, Liangchao, Liu, Haijun, Le, Huijun, Yuan, Jing, Shan, Weifeng, Han, Ying, Yuan, Guoming, Cui, Chunjie, and Wang, Junling
- Subjects
- RECURRENT neural networks, GLOBAL Positioning System, CONVOLUTIONAL neural networks, MAGNETIC storms, DEEP learning, PREDICTION models
- Abstract
Total electron content (TEC) is a vital parameter for describing the state of the ionosphere, and precise prediction of TEC is of great significance for improving the accuracy of the Global Navigation Satellite System (GNSS). At present, most deep learning prediction models just consider TEC temporal variation, while ignoring the impact of spatial location. In this paper, we propose a TEC prediction model, ED-ConvLSTM, which combines convolutional neural networks with recurrent neural networks to simultaneously consider spatiotemporal features. Our ED-ConvLSTM model is built based on the encoder-decoder architecture, which includes two modules: encoder module and decoder module. Each module is composed of ConvLSTM cells. The encoder module is used to extract the spatiotemporal features from TEC maps, while the decoder module converts spatiotemporal features into predicted TEC maps. We compared the predictive performance of our model with two traditional time series models (LSTM and GRU), a spatiotemporal model (ConvGRU), and the TEC daily forecast product C1PG provided by CODE on a total of 135 grid points in East Asia (10°–45°N, 90°–130°E). The experimental results show that the prediction error indicators MAE, RMSE, MAPE, and prediction similarity index SSIM of our model are superior to those of the comparison models in high, normal, and low solar activity years. The paper also analyzed the predictive performance of each model monthly. The experimental results indicate that the predictive performance of each model is influenced by the monthly mean of TEC. The ED-ConvLSTM model proposed in this paper is the least affected and the most stable by the monthly mean of TEC. Additionally, the paper compared the predictive performance of each model during two magnetic storm periods when TEC changes sharply. The results indicate that our ED-ConvLSTM model is least affected during magnetic storms and its predictive performance is superior to those of the comparative models. This paper provides a more stable and high-performance TEC spatiotemporal prediction model. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
18. LDnADMM-Net: A Denoising Unfolded Deep Neural Network for Direction-of-Arrival Estimations in A Low Signal-to-Noise Ratio.
- Author
-
Liang, Can, Liu, Mingxuan, Li, Yang, Wang, Yanhua, and Hu, Xueyao
- Subjects
DIRECTION of arrival estimation ,CONVOLUTIONAL neural networks ,SIGNAL-to-noise ratio ,COMPRESSED sensing ,SIGNAL denoising - Abstract
In this paper, we explore the problem of direction-of-arrival (DOA) estimation for a non-uniform linear array (NULA) under strong noise. Compressed sensing (CS)-based methods are widely used for NULA DOA estimation. However, these methods commonly rely on parameters that are hard to fine-tune, and they lack robustness under strong noise. To address these issues, this paper proposes a novel DOA estimation approach using a deep neural network (DNN) for a NULA at a low signal-to-noise ratio (SNR). The proposed network, dubbed LDnADMM-Net, is designed based on the denoising convolutional neural network (DnCNN) and the alternating direction method of multipliers (ADMM). First, we construct an unfolded DNN architecture that mimics the iterative processing of an ADMM. In this way, the parameters of the ADMM are transformed into network weights, so these parameters can be adaptively optimized through network training. Then, we employ the DnCNN to develop a denoising module (DnM) and integrate it into the unfolded DNN. Using this DnM, we can enhance the anti-noise ability of the proposed network and obtain robust DOA estimation at a low SNR. The simulation and experimental results show that the proposed LDnADMM-Net obtains high-accuracy, super-resolution DOA estimates for a NULA with strong robustness at a low SNR. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
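LDnADMM-Net unfolds ADMM iterations into network layers whose tuning parameters become trainable weights. A classical (non-learned) ADMM for the LASSO-style sparse recovery that such unfolded networks mimic can be sketched as follows; the function name `admm_lasso`, the fixed `lam`/`rho` values, and the identity-matrix example are illustrative assumptions, not the paper's model:

```python
import numpy as np

def soft(v, t):
    """Soft-thresholding, the proximal operator of the L1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def admm_lasso(A, y, lam=0.1, rho=1.0, iters=100):
    """Classical ADMM for min_x 0.5*||Ax - y||^2 + lam*||x||_1."""
    m, n = A.shape
    x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
    AtA, Aty = A.T @ A, A.T @ y
    inv = np.linalg.inv(AtA + rho * np.eye(n))  # precomputed x-update solve
    for _ in range(iters):
        x = inv @ (Aty + rho * (z - u))   # quadratic (least-squares) step
        z = soft(x + u, lam / rho)        # sparsity-promoting step
        u = u + x - z                     # dual ascent
    return z

A = np.eye(4)                             # toy sensing matrix
y = np.array([2.0, 0.05, -1.0, 0.0])
xhat = admm_lasso(A, y, lam=0.1)
```

In the unfolded network, each loop iteration becomes a layer, and quantities such as the threshold level `lam / rho` are learned per layer instead of hand-tuned.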
19. Convolutional Neural Network-Based Method for Agriculture Plot Segmentation in Remote Sensing Images.
- Author
-
Qi, Liang, Zuo, Danfeng, Wang, Yirong, Tao, Ye, Tang, Runkang, Shi, Jiayu, Gong, Jiajun, and Li, Bangyu
- Subjects
IMAGE segmentation ,REMOTE sensing ,REMOTE-sensing images ,LAND use ,FEATURE extraction ,AGRICULTURAL productivity - Abstract
Accurate delineation of individual agricultural plots, the foundational units for agriculture-based activities, is crucial for effective government oversight of agricultural productivity and land utilization. To improve the accuracy of plot segmentation in high-resolution remote sensing images, this paper collects GF-2 satellite remote sensing images, uses ArcGIS 10.3.1 software to establish datasets, and builds UNet, SegNet, DeeplabV3+, and TransUNet neural network frameworks for experimental analysis. Then, the TransUNet network, which yields the best segmentation results, is optimized in both the residual module and the skip connection to further improve its performance for plot segmentation in high-resolution remote sensing images. This article introduces Deformable ConvNets in the residual module to improve the original ResNet50 feature extraction network and incorporates the convolutional block attention module (CBAM) at the skip connection to improve the skip-connection steps. Experimental results indicate that the optimized remote sensing plot segmentation algorithm based on the TransUNet network achieves an Accuracy of 86.02%, a Recall of 83.32%, an F1-score of 84.67%, and an Intersection over Union (IoU) of 86.90%. Compared to the original TransUNet network for remote sensing land parcel segmentation, whose F1-score is 81.94% and whose IoU is 69.41%, the optimized TransUNet network significantly improves the performance of remote sensing land parcel segmentation, which verifies the effectiveness and reliability of the plot segmentation algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
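The plot-segmentation abstract above reports Recall, F1-score, and IoU. For reference, these pixel-wise scores reduce to simple counts over the agreement between predicted and ground-truth masks; this sketch (function name and toy masks invented) shows the arithmetic:

```python
import numpy as np

def seg_scores(pred, gt):
    """Pixel-wise Recall, F1-score, and IoU for a binary plot mask."""
    pred, gt = np.asarray(pred, bool), np.asarray(gt, bool)
    tp = np.sum(pred & gt)    # plot pixels correctly predicted
    fp = np.sum(pred & ~gt)   # background predicted as plot
    fn = np.sum(~pred & gt)   # plot pixels missed
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    iou = tp / (tp + fp + fn)
    return recall, f1, iou

gt   = np.array([[1, 1, 0], [1, 0, 0]])  # toy ground truth
pred = np.array([[1, 1, 0], [0, 0, 1]])  # toy prediction
recall, f1, iou = seg_scores(pred, gt)
```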
20. Target Detection Method for High-Frequency Surface Wave Radar RD Spectrum Based on (VI)CFAR-CNN and Dual-Detection Maps Fusion Compensation.
- Author
-
Ji, Yuanzheng, Liu, Aijun, Chen, Xuekun, Wang, Jiaqi, and Yu, Changjun
- Subjects
CONVOLUTIONAL neural networks ,TRACKING algorithms ,AUTOMATIC identification - Abstract
This paper proposes a method for the intelligent detection of high-frequency surface wave radar (HFSWR) targets. The method cascades an adaptive variability-index constant false alarm rate ((VI)CFAR) detector with a convolutional neural network (CNN) to form a cascade detector, (VI)CFAR-CNN. First, the (VI)CFAR algorithm performs first-level detection on the range–Doppler (RD) spectrum; based on this result, two-dimensional window slices centered on the target's position in the RD spectrum are extracted and input into the CNN model for further target and clutter identification. When the detection rate reaches a level that cannot be further improved because the CNN model has converged, this paper uses a dual-detection-maps fusion method to compensate for the loss of detection performance: optimized parameters are used to perform a weighted fusion of the dual detection maps, and the connected components in the fused detection map are then further processed so that an independent (VI)CFAR compensates the (VI)CFAR-CNN detection results. Because HFSWR data with comprehensive and accurate target truth values are difficult to obtain, this paper embeds targets into the measured background to construct the RD-spectrum dataset for HFSWR. The proposed method is compared with various other methods to demonstrate its superiority. Additionally, a small amount of automatic identification system (AIS) and radar correlation data is used to verify the effectiveness and feasibility of this method on fully measured HFSWR data. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
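The cascade's first stage is a CFAR detector on the RD spectrum. A minimal one-dimensional cell-averaging CFAR conveys the idea; note that the (VI)CFAR used in the paper adaptively switches among CFAR variants per cell, which this sketch does not implement, and the guard/training window sizes and scale factor here are arbitrary:

```python
import numpy as np

def ca_cfar(x, guard=2, train=8, scale=4.0):
    """1-D cell-averaging CFAR: flag a cell if it exceeds a threshold
    scaled from the mean of the surrounding training cells (guard cells
    around the cell under test are excluded from the noise estimate)."""
    n = len(x)
    hits = np.zeros(n, bool)
    for i in range(n):
        lo = max(0, i - guard - train)
        hi = min(n, i + guard + train + 1)
        window = np.r_[x[lo:max(0, i - guard)], x[i + guard + 1:hi]]
        if window.size and x[i] > scale * window.mean():
            hits[i] = True
    return hits

signal = np.ones(64)      # flat unit-power noise floor
signal[32] = 20.0         # one injected target
det = ca_cfar(signal)
```

The constant `scale` sets the false-alarm rate for a known noise model; the adaptive thresholding is what keeps the false-alarm probability constant as the clutter level varies.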
21. Fire-Net: Rapid Recognition of Forest Fires in UAV Remote Sensing Imagery Using Embedded Devices.
- Author
-
Li, Shouliang, Han, Jiale, Chen, Fanghui, Min, Rudong, Yi, Sixue, and Yang, Zhen
- Subjects
FOREST fires ,CONVOLUTIONAL neural networks ,FOREST monitoring ,DRONE aircraft ,WILDFIRES - Abstract
Forest fires pose a catastrophic threat to Earth's ecology as well as to human beings. Timely and accurate monitoring of forest fires can significantly reduce potential casualties and property damage. To address these problems, this paper proposed a lightweight forest fire recognition model for unmanned aerial vehicles (UAVs), Fire-Net, which has a multi-stage structure and incorporates cross-channel attention following the fifth stage. This design enables the model to perceive features at various scales, particularly small-scale fire sources in wild forest scenes. Through training and testing on a real-world dataset, various lightweight convolutional neural networks were evaluated on embedded devices. The experimental outcomes indicate that Fire-Net attained an accuracy of 98.18%, a precision of 99.14%, and a recall of 98.01%, surpassing the current leading methods. Furthermore, the model showcases an average inference time of 10 milliseconds per image and operates at 86 frames per second (FPS) on embedded devices. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
22. Real-Time Wildfire Monitoring Using Low-Altitude Remote Sensing Imagery.
- Author
-
Tong, Hongwei, Yuan, Jianye, Zhang, Jingjing, Wang, Haofei, and Li, Teng
- Subjects
CONVOLUTIONAL neural networks ,TRANSFORMER models ,DRONE aircraft ,SUMMER ,REMOTE sensing ,FIRE detectors - Abstract
With rising global temperatures, wildfires frequently occur worldwide during the summer season. The timely detection of these fires, based on unmanned aerial vehicle (UAV) images, can significantly reduce the damage they cause. Existing Convolutional Neural Network (CNN)-based fire detection methods usually stack convolutional layers to enlarge the receptive field, but this compromises real-time performance. This paper proposes a novel real-time semantic segmentation network called FireFormer, combining the strengths of CNNs and Transformers to detect fires. An agile ResNet18 is adopted as the encoding component, tailored for efficient fire segmentation, and a Forest Fire Transformer Block (FFTB) rooted in the Transformer architecture is proposed as the decoding mechanism. Additionally, to accurately detect and segment small fire spots, we have developed a novel Feature Refinement Network (FRN) to enhance fire segmentation accuracy. The experimental results demonstrate that our proposed FireFormer achieves state-of-the-art performance on the publicly available forest fire dataset FLAME: specifically, an impressive 73.13% IoU and 84.48% F1 score. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
23. MBT-UNet: Multi-Branch Transform Combined with UNet for Semantic Segmentation of Remote Sensing Images.
- Author
-
Liu, Bin, Li, Bing, Sreeram, Victor, and Li, Shuofeng
- Subjects
CONVOLUTIONAL neural networks ,TRANSFORMER models ,REMOTE sensing ,ENVIRONMENTAL monitoring ,RESOURCE management - Abstract
Remote sensing (RS) images play an indispensable role in many key fields such as environmental monitoring, precision agriculture, and urban resource management. Traditional deep convolutional neural networks have the problem of limited receptive fields. To address this problem, this paper introduces a hybrid network model that combines the advantages of CNN and Transformer, called MBT-UNet. First, a multi-branch encoder design based on the pyramid vision transformer (PVT) is proposed to effectively capture multi-scale feature information; second, an efficient feature fusion module (FFM) is proposed to optimize the collaboration and integration of features at different scales; finally, in the decoder stage, a multi-scale upsampling module (MSUM) is proposed to further refine the segmentation results and enhance segmentation accuracy. We conduct experiments on the ISPRS Vaihingen dataset, the Potsdam dataset, the LoveDA dataset, and the UAVid dataset. Experimental results show that MBT-UNet surpasses state-of-the-art algorithms in key performance indicators, confirming its superior performance in high-precision remote sensing image segmentation tasks. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
24. Radar Emitter Recognition Based on Spiking Neural Networks.
- Author
-
Luo, Zhenghao, Wang, Xingdong, Yuan, Shuo, and Liu, Zhangmeng
- Subjects
ARTIFICIAL neural networks ,RADAR signal processing ,MILITARY electronics ,CONVOLUTIONAL neural networks ,ELECTRONIC measurements - Abstract
Efficient and effective radar emitter recognition is critical for electronic support measures (ESM) systems. However, in complex electromagnetic environments, intercepted pulse trains generally contain substantial data noise, including spurious and missing pulses. Currently, radar emitter recognition methods utilizing traditional artificial neural networks (ANNs) such as CNNs and RNNs are susceptible to data noise and require intensive computation, posing challenges to meeting the performance demands of modern ESM systems. Spiking neural networks (SNNs) exhibit stronger representational capabilities than traditional ANNs due to the temporal dynamics of spiking neurons and the richer information encoded in precise spike timing. Furthermore, SNNs achieve higher computational efficiency by performing event-driven sparse addition operations. In this paper, a lightweight spiking neural network is proposed by combining direct coding, leaky integrate-and-fire (LIF) neurons, and surrogate gradients to recognize radar emitters. Additionally, an improved SNN for radar emitter recognition is proposed that leverages the local timing structure of pulses to enhance adaptability to data noise. Simulation results demonstrate the superior performance of the proposed method over existing methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
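The SNN abstract above builds on leaky integrate-and-fire (LIF) neurons. As a hedged sketch (the decay `beta` and threshold `v_th` are arbitrary values, and real SNN layers vectorize this over many neurons and use surrogate gradients for training), a discrete-time LIF update is just a leak, an integration, and a thresholded reset:

```python
def lif_run(inputs, beta=0.9, v_th=1.0):
    """Discrete leaky integrate-and-fire neuron: leak the membrane
    potential, integrate the input current, emit a spike when the
    potential crosses v_th, then hard-reset to zero."""
    v, spikes = 0.0, []
    for x in inputs:
        v = beta * v + x          # leak + integrate
        s = 1 if v >= v_th else 0
        spikes.append(s)
        if s:
            v = 0.0               # reset after a spike
    return spikes

# a constant sub-threshold input still fires periodically via integration
spikes = lif_run([0.4] * 10)
```

The binary, event-driven output is what lets SNN hardware replace dense multiply-accumulates with sparse additions.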
25. A Novel Mamba Architecture with a Semantic Transformer for Efficient Real-Time Remote Sensing Semantic Segmentation.
- Author
-
Ding, Hao, Xia, Bo, Liu, Weilin, Zhang, Zekai, Zhang, Jinglin, Wang, Xing, and Xu, Sen
- Subjects
CONVOLUTIONAL neural networks ,REMOTE sensing ,TRANSFORMER models ,COMPUTATIONAL complexity ,EARTHQUAKES - Abstract
Real-time remote sensing segmentation technology is crucial for unmanned aerial vehicles (UAVs) in battlefield surveillance, land characterization observation, earthquake disaster assessment, etc., and can significantly enhance the application value of UAVs in military and civilian fields. To realize this potential, it is essential to develop real-time semantic segmentation methods that can be applied to resource-limited platforms, such as edge devices. The majority of mainstream real-time semantic segmentation methods rely on convolutional neural networks (CNNs) and transformers. However, CNNs cannot effectively capture long-range dependencies, while transformers have high computational complexity. This paper proposes a novel remote sensing Mamba architecture for real-time segmentation tasks in remote sensing, named RTMamba. Specifically, the backbone utilizes a Visual State-Space (VSS) block to extract deep features and maintains linear computational complexity, thereby capturing long-range contextual information. Additionally, a novel Inverted Triangle Pyramid Pooling (ITP) module is incorporated into the decoder. The ITP module can effectively filter redundant feature information and enhance the perception of objects and their boundaries in remote sensing images. Extensive experiments were conducted on three challenging aerial remote sensing segmentation benchmarks, including Vaihingen, Potsdam, and LoveDA. The results show that RTMamba achieves competitive performance advantages in terms of segmentation accuracy and inference speed compared to state-of-the-art CNN and transformer methods. To further validate the deployment potential of the model on embedded devices with limited resources, such as UAVs, we conducted tests on the Jetson AGX Orin edge device. The experimental results demonstrate that RTMamba achieves impressive real-time segmentation performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
26. Intrapulse Modulation Radar Signal Recognition Using CNN with Second-Order STFT-Based Synchrosqueezing Transform.
- Author
-
Dong, Ning, Jiang, Hong, Liu, Yipeng, and Zhang, Jingtao
- Subjects
CONVOLUTIONAL neural networks ,SIGNAL classification ,FOURIER transforms ,SIGNAL-to-noise ratio ,RADAR ,PHOTOPLETHYSMOGRAPHY - Abstract
Intrapulse modulation classification of radar signals plays an important role in modern electronic reconnaissance, countermeasures, and related fields. In this paper, to improve the recognition rate at a low signal-to-noise ratio (SNR), we propose a recognition method using the second-order short-time Fourier transform (STFT)-based synchrosqueezing transform (FSST2) combined with a modified convolutional neural network, which we name MeNet. In particular, the radar signals are first preprocessed via time–frequency analysis with the STFT-based FSST2. Then, the informative features of the time–frequency images (TFIs) are deeply learned and classified by MeNet through several specific convolutional blocks. The simulation results show that the overall recognition rate for seven types of intrapulse modulation radar signals can reach 95.6%, even when the SNR is −12 dB. Compared with other networks, this excellent recognition rate proves the superiority of our method. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
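The recognition pipeline above first maps each radar signal to a time-frequency image (TFI) before CNN classification. A plain magnitude STFT, the first-order transform underlying FSST2, can be sketched with NumPy; the synchrosqueezing step, which reassigns energy along frequency to sharpen ridges, is omitted here, and the window/hop sizes are arbitrary:

```python
import numpy as np

def stft_mag(x, win=64, hop=16):
    """Magnitude STFT: the raw time-frequency image fed to a classifier.
    FSST2 would additionally squeeze energy toward instantaneous-frequency
    ridges; this sketch stops at the first-order spectrogram."""
    w = np.hanning(win)
    frames = [x[i:i + win] * w for i in range(0, len(x) - win + 1, hop)]
    return np.abs(np.fft.rfft(frames, axis=1)).T  # (freq bins, time frames)

fs = 1024
t = np.arange(fs) / fs
sig = np.cos(2 * np.pi * 128 * t)      # a pure 128 Hz tone
tfi = stft_mag(sig)
peak_bin = tfi.mean(axis=1).argmax()
# bin spacing = fs / win = 16 Hz, so the tone concentrates at bin 8
```

For an intrapulse-modulated signal (e.g., LFM), the ridge in the TFI would sweep across bins over time, which is the structure the CNN learns to classify.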
27. BAFormer: A Novel Boundary-Aware Compensation UNet-like Transformer for High-Resolution Cropland Extraction.
- Author
-
Li, Zhiyong, Wang, Youming, Tian, Fa, Zhang, Junbo, Chen, Yijie, and Li, Kunhong
- Subjects
CONVOLUTIONAL neural networks ,TRANSFORMER models ,DEEP learning ,REMOTE sensing ,FARMS - Abstract
Utilizing deep learning for semantic segmentation of cropland from remote sensing imagery has become a crucial technique in land surveys. Cropland is highly heterogeneous and fragmented, and existing methods often suffer from inaccurate boundary segmentation. This paper introduces a UNet-like boundary-aware compensation model (BAFormer). Cropland boundaries typically exhibit rapid transitions in pixel values and texture features, often appearing as high-frequency features in remote sensing images. To enhance the recognition of these high-frequency features represented by cropland boundaries, the proposed BAFormer integrates a Feature Adaptive Mixer (FAM) and develops a Depthwise Large Kernel Multi-Layer Perceptron model (DWLK-MLP) to enrich the global and local cropland boundary features, respectively. Specifically, FAM enhances boundary awareness by adaptively acquiring high-frequency features through the complementary advantages of convolution and self-attention, while DWLK-MLP further supplements boundary position information using a large receptive field. The efficacy of BAFormer has been evaluated on the Vaihingen, Potsdam, LoveDA, and Mapcup datasets. It demonstrates high performance, achieving mIoU scores of 84.5%, 87.3%, 53.5%, and 83.1% on these datasets, respectively. Notably, BAFormer-T (a lightweight variant) surpasses other lightweight models on the Vaihingen dataset with 91.3% F1 and 84.1% mIoU. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
28. Graph Neural Networks in Point Clouds: A Survey.
- Author
-
Li, Dilong, Lu, Chenghui, Chen, Ziyi, Guan, Jianlong, Zhao, Jing, and Du, Jixiang
- Subjects
GRAPH neural networks ,CONVOLUTIONAL neural networks ,NATURAL language processing ,OBJECT recognition (Computer vision) ,TRANSFORMER models - Abstract
With the advancement of 3D sensing technologies, point clouds are gradually becoming the main type of data representation in applications such as autonomous driving, robotics, and augmented reality. Nevertheless, the irregularity inherent in point clouds presents numerous challenges for traditional deep learning frameworks. Graph neural networks (GNNs) have demonstrated their tremendous potential in processing graph-structured data and are widely applied in various domains including social media data analysis, molecular structure calculation, and computer vision. GNNs, with their capability to handle non-Euclidean data, offer a novel approach for addressing these challenges. Additionally, drawing inspiration from the achievements of transformers in natural language processing, graph transformers have propelled models towards global awareness, overcoming the limitations of local aggregation mechanisms inherent in early GNN architectures. This paper provides a comprehensive review of GNNs and graph-based methods in point cloud applications, adopting a task-oriented perspective to analyze this field. We categorize GNN methods for point clouds based on fundamental tasks, such as segmentation, classification, object detection, registration, and other related tasks. For each category, we summarize the existing mainstream methods, conduct a comprehensive analysis of their performance on various datasets, and discuss the development trends and future prospects of graph-based methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
29. A Study on the Object-Based High-Resolution Remote Sensing Image Classification of Crop Planting Structures in the Loess Plateau of Eastern Gansu Province.
- Author
-
Yang, Rui, Qi, Yuan, Zhang, Hui, Wang, Hongwei, Zhang, Jinlong, Ma, Xiaofang, Zhang, Juan, and Ma, Chao
- Subjects
IMAGE recognition (Computer vision) ,CONVOLUTIONAL neural networks ,REMOTE sensing ,CROPS ,STANDARD deviations ,IMAGE segmentation ,CROP quality ,PRECISION farming - Abstract
The timely and accurate acquisition of information on the distribution of the crop planting structure in the Loess Plateau of eastern Gansu Province, one of the most important agricultural areas in Western China, is crucial for promoting fine management of agriculture and ensuring food security. This study uses multi-temporal high-resolution remote sensing images to determine optimal segmentation scales for various crops, employing the estimation of scale parameter 2 (ESP2) tool and the Ratio of Mean Absolute Deviation to Standard Deviation (RMAS) model. The Canny edge detection algorithm is then applied for multi-scale image segmentation. By incorporating crop phenological factors and using the L1-regularized logistic regression model, we optimized 39 spatial feature factors, including spectral, textural, geometric, and index features. Within a multi-level classification framework, the Random Forest (RF) classifier and Convolutional Neural Network (CNN) model are used to classify the cropping patterns in four test areas based on the multi-scale segmented images. The results indicate that integrating the Canny edge detection algorithm with the optimal segmentation scales calculated using the ESP2 tool and RMAS model produces crop parcels with more complete boundaries and better separability. Additionally, optimizing spatial features using the L1-regularized logistic regression model, combined with phenological information, enhances classification accuracy. Within the OBIC framework, the RF classifier achieves higher accuracy in classifying cropping patterns. The overall classification accuracies for the four test areas are 91.93%, 94.92%, 89.37%, and 90.68%, respectively. This paper introduces crop phenological factors, effectively improving the extraction precision of the fragmented agricultural planting structure in the Loess Plateau of eastern Gansu Province.
Its findings have important application value in crop monitoring, management, food security, and other related fields. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
30. Application of Deep Learning for Segmenting Seepages in Levee Systems.
- Author
-
Panta, Manisha, Thapa, Padam Jung, Hoque, Md Tamjidul, Niles, Kendall N., Sloan, Steve, Flanagin, Maik, Pathak, Ken, and Abdelguerfi, Mahdi
- Subjects
DEEP learning ,CONVOLUTIONAL neural networks ,LEVEES - Abstract
Seepage is a typical hydraulic factor that can initiate the breaching process in a levee system. If not identified and treated in time, seepages can be a severe problem for levees, weakening the levee structure and eventually leading to collapse. Therefore, it is essential to remain vigilant, with regular monitoring procedures to identify seepages throughout these levee systems and to perform adequate repairs that limit potential threats from unforeseen levee failures. This paper introduces a fully convolutional neural network to identify and segment seepage from images of levee systems. To the best of our knowledge, this is the first work in this domain. Applying deep learning techniques to semantic segmentation tasks in real-world scenarios has its own challenges, especially the difficulty for models to learn effectively from complex backgrounds while focusing on simpler objects of interest. This challenge is particularly evident in detecting seepages in levee systems, where the fault is relatively simple compared to the complex and varied background. We addressed this problem by introducing negative images and a controlled transfer learning approach for accurate seepage segmentation in levee systems. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
31. Deep Learning-Based Detection of Oil Spills in Pakistan's Exclusive Economic Zone from January 2017 to December 2023.
- Author
-
Basit, Abdul, Siddique, Muhammad Adnan, Bashir, Salman, Naseer, Ehtasham, and Sarfraz, Muhammad Saquib
- Subjects
CONVOLUTIONAL neural networks ,OIL spills ,OIL seepage ,ALGAL blooms ,TOXIC algae ,MARINE accidents ,INSPECTION & review - Abstract
Oil spillages on a sea's or an ocean's surface are a threat to marine and coastal ecosystems. They are mainly caused by ship accidents, illegal discharge of oil from ships during cleaning and oil seepage from natural reservoirs. Synthetic-Aperture Radar (SAR) has proved to be a useful tool for analyzing oil spills, because it operates in all-day, all-weather conditions. An oil spill can typically be seen as a dark stretch in SAR images and can often be detected through visual inspection. The major challenge is to differentiate oil spills from look-alikes, i.e., low-wind areas, algae blooms and grease ice, etc., that have a dark signature similar to that of an oil spill. It has been noted over time that oil spill events in Pakistan's territorial waters often remain undetected until the oil reaches the coastal regions or it is located by concerned authorities during patrolling. A formal remote sensing-based operational framework for oil spills detection in Pakistan's Exclusive Economic Zone (EEZ) in the Arabian Sea is urgently needed. In this paper, we report the use of an encoder–decoder-based convolutional neural network trained on an annotated dataset comprising selected oil spill events verified by the European Maritime Safety Agency (EMSA). The dataset encompasses multiple classes, viz., sea surface, oil spill, look-alikes, ships and land. We processed Sentinel-1 acquisitions over the EEZ from January 2017 to December 2023, and we thereby prepared a repository of SAR images for the aforementioned duration. This repository contained images that had been vetted by SAR experts, to trace and confirm oil spills. We tested the repository using the trained model, and, to our surprise, we detected 92 previously unreported oil spill events within those seven years. In 2020, our model detected 26 oil spills in the EEZ, which corresponds to the highest number of spills detected in a single year; whereas in 2023, our model detected 10 oil spill events. 
In terms of the total surface area covered by the spills, the worst year was 2021, with a cumulative 395 sq. km covered in oil or an oil-like substance. On the whole, these are alarming figures. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
32. Global-Local Collaborative Learning Network for Optical Remote Sensing Image Change Detection.
- Author
-
Li, Jinghui, Shao, Feng, Liu, Qiang, and Meng, Xiangchao
- Subjects
OPTICAL remote sensing ,COLLABORATIVE learning ,CONVOLUTIONAL neural networks ,TRANSFORMER models ,REMOTE sensing ,ARTIFICIAL satellites - Abstract
Due to the widespread applications of change detection technology in urban change analysis, environmental monitoring, agricultural surveillance, disaster detection, and other domains, the task of change detection has become one of the primary applications of Earth orbit satellite remote sensing data. However, the analysis of dual-temporal change detection (CD) remains a challenge in high-resolution optical remote sensing images due to the complexities in remote sensing images, such as intricate textures, seasonal variations in imaging time, climatic differences, and significant differences in the sizes of various objects. In this paper, we propose a novel U-shaped architecture for change detection. In the encoding stage, a multi-branch feature extraction module is employed by combining CNN and transformer networks to enhance the network's perception capability for objects of varying sizes. Furthermore, a multi-branch aggregation module is utilized to aggregate features from different branches, providing the network with global attention while preserving detailed information. For dual-temporal features, we introduce a spatiotemporal discrepancy perception module to model the context of dual-temporal images. Particularly noteworthy is the construction of channel attention and token attention modules based on the transformer attention mechanism to facilitate information interaction between multi-level features, thereby enhancing the network's contextual awareness. The effectiveness of the proposed network is validated on three public datasets, demonstrating its superior performance over other state-of-the-art methods through qualitative and quantitative experiments. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
33. Lightweight Pedestrian Detection Network for UAV Remote Sensing Images Based on Strideless Pooling.
- Author
-
Liu, Sanzai, Cao, Lihua, and Li, Yi
- Subjects
OBJECT recognition (Computer vision) ,PEDESTRIANS ,TRAFFIC monitoring ,CONVOLUTIONAL neural networks ,EMERGENCY management - Abstract
The need for pedestrian target detection in uncrewed aerial vehicle (UAV) remote sensing images has become increasingly significant as the technology continues to evolve. UAVs equipped with high-resolution cameras can capture detailed imagery of various scenarios, making them ideal for monitoring and surveillance applications. Pedestrian detection is particularly crucial in scenarios such as traffic monitoring, security surveillance, and disaster response, where the safety and well-being of individuals are paramount. However, pedestrian detection in UAV remote sensing images poses several challenges. Firstly, the small size of pedestrians relative to the overall image, especially at higher altitudes, makes them difficult to detect. Secondly, the varying backgrounds and lighting conditions in remote sensing images can further complicate the task of detection. Traditional object detection methods often struggle to handle these complexities, resulting in decreased detection accuracy and increased false positives. Addressing the aforementioned concerns, this paper proposes a lightweight object detection model that integrates GhostNet and YOLOv5s. Building upon this foundation, we further introduce the SPD-Conv module to the model. With this addition, the aim is to preserve fine-grained features of the images during downsampling, thereby enhancing the model's capability to recognize small-scale objects. Furthermore, the coordinate attention module is introduced to further improve the model's recognition accuracy. In the proposed model, the number of parameters is successfully reduced to 4.77 M, compared with 7.01 M in YOLOv5s, representing a 32% reduction. The mean average precision (mAP) increased from 0.894 to 0.913, reflecting a 1.9% improvement. We have named the proposed model "GSC-YOLO". 
This study holds significant importance in advancing the lightweighting of UAV target detection models and addressing the challenges associated with complex scene object detection. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
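The SPD-Conv module mentioned in the GSC-YOLO abstract replaces strided downsampling with a space-to-depth rearrangement so that fine-grained pixel information is moved into channels rather than discarded. A NumPy sketch of the space-to-depth step follows (the non-strided convolution that follows it in SPD-Conv is omitted, and the function name is invented):

```python
import numpy as np

def space_to_depth(x, scale=2):
    """Rearrange a (C, H, W) feature map into (C*scale^2, H/scale, W/scale).
    Unlike strided pooling, every input value survives the downsampling."""
    c, h, w = x.shape
    x = x.reshape(c, h // scale, scale, w // scale, scale)
    return x.transpose(0, 2, 4, 1, 3).reshape(
        c * scale * scale, h // scale, w // scale)

x = np.arange(16, dtype=float).reshape(1, 4, 4)  # toy single-channel map
y = space_to_depth(x)
# spatial resolution halves, channel count quadruples, no values lost
```

This losslessness is why the abstract credits SPD-Conv with preserving fine-grained features of small-scale objects such as distant pedestrians.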
34. MFPANet: Multi-Scale Feature Perception and Aggregation Network for High-Resolution Snow Depth Estimation.
- Author
-
Zhao, Liling, Chen, Junyu, Shahzad, Muhammad, Xia, Min, and Lin, Haifeng
- Subjects
SNOW accumulation ,MICROWAVE remote sensing ,SYNTHETIC aperture radar ,REMOTE-sensing images ,DEPTH perception ,REMOTE sensing ,AVALANCHES - Abstract
Accurate snow depth estimation is of significant importance, particularly for preventing avalanche disasters and predicting flood seasons. The predominant approaches for such snow depth estimation, based on deep learning methods, typically rely on passive microwave remote sensing data. However, the low resolution of passive microwave remote sensing data often results in low-accuracy outcomes, posing considerable limitations in application. To further improve the accuracy of snow depth estimation, in this paper, we used active microwave remote sensing data. We fused multi-spectral optical satellite images, synthetic aperture radar (SAR) images, and land cover distribution images to generate a snow remote sensing dataset (SRSD). It is a first-of-its-kind dataset that includes active microwave remote sensing images in high-latitude regions of Asia. Using these novel data, we proposed a multi-scale feature perception and aggregation neural network (MFPANet) that focuses on improving feature extraction from multi-source images. Our systematic analysis reveals that the proposed approach is not only robust but also achieves high accuracy in snow depth estimation compared to existing state-of-the-art methods, with an RMSE of 0.360 and an MAE of 0.128. Finally, we selected several representative areas in our study region and applied our method to map snow depth distribution, demonstrating its broad application prospects. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
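The RMSE and MAE figures quoted in the MFPANet abstract above are standard regression error metrics. As a minimal sketch of how such scores are computed (the depth values below are made up for illustration, not the paper's data):

```python
import numpy as np

def rmse(y_true, y_pred):
    # root-mean-square error: penalizes large residuals more strongly
    return float(np.sqrt(np.mean((np.asarray(y_true, float) - np.asarray(y_pred, float)) ** 2)))

def mae(y_true, y_pred):
    # mean absolute error: average magnitude of the residuals
    return float(np.mean(np.abs(np.asarray(y_true, float) - np.asarray(y_pred, float))))

true_depth = [0.30, 0.45, 0.10, 0.60]   # hypothetical snow depths (m)
pred_depth = [0.32, 0.40, 0.12, 0.55]
print(rmse(true_depth, pred_depth), mae(true_depth, pred_depth))
```

Because RMSE squares the residuals before averaging, it is always at least as large as MAE on the same data; the gap between the two numbers hints at how heavy-tailed the errors are.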
35. Hyperspectral Image Denoising Based on Deep and Total Variation Priors.
- Author
-
Wang, Peng, Sun, Tianman, Chen, Yiming, Ge, Lihua, Wang, Xiaoyi, and Wang, Liguo
- Subjects
DEEP learning ,IMAGE denoising ,CONVOLUTIONAL neural networks ,SPECTRAL imaging ,SPARSE matrices ,STOCHASTIC processes - Abstract
To address the problems of noise interference and image blurring in hyperspectral imaging (HSI), this paper proposes a denoising method for HSI based on deep learning and a total variation (TV) prior. The method minimizes the first-order moment distance between the deep prior of a Fast and Flexible Denoising Convolutional Neural Network (FFDNet) and the Enhanced 3D TV (E3DTV) prior, obtaining dual priors that complement and reinforce each other's advantages. Specifically, the original HSI is initially processed with a random binary sparse observation matrix to achieve a sparse representation. Subsequently, the plug-and-play (PnP) algorithm is employed within the framework of generalized alternating projection (GAP) to denoise the sparsely represented HSI. Experimental results demonstrate that, compared to existing methods, this method shows significant advantages in both quantitative and qualitative assessments, effectively enhancing the quality of HSIs. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
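The E3DTV prior in the entry above operates on 3D hyperspectral cubes. As a much-simplified illustration of the total-variation idea behind it, here is gradient descent on a smoothed 1D TV-regularized objective; all parameter values and the 1D setting are illustrative choices, not the paper's method:

```python
import numpy as np

def tv_denoise_1d(y, lam=0.3, step=0.02, iters=1000, eps=1e-3):
    # Gradient descent on 0.5*||x - y||^2 + lam * sum_i sqrt((x[i+1]-x[i])^2 + eps),
    # a smoothed 1D total-variation objective (E3DTV itself acts on 3D HSI cubes).
    x = y.astype(float).copy()
    for _ in range(iters):
        d = np.diff(x)
        g = d / np.sqrt(d * d + eps)       # derivative of the smoothed TV term
        tv_grad = np.concatenate(([-g[0]], g[:-1] - g[1:], [g[-1]]))
        x -= step * ((x - y) + lam * tv_grad)
    return x

rng = np.random.default_rng(0)
clean = np.concatenate([np.zeros(20), np.ones(20)])   # piecewise-constant signal
noisy = clean + 0.2 * rng.standard_normal(40)
denoised = tv_denoise_1d(noisy)
```

The TV term pulls neighboring samples together (flattening noise) while the data-fidelity term keeps the estimate close to the observations; the jump in the step signal survives because TV penalizes the sum of absolute differences, not their squares.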
36. Improving Artificial-Intelligence-Based Individual Tree Species Classification Using Pseudo Tree Crown Derived from Unmanned Aerial Vehicle Imagery.
- Author
-
Miao, Shengjie, Zhang, Kongwen, Zeng, Hongda, and Liu, Jane
- Subjects
CROWNS (Botany) ,DRONE aircraft ,CONVOLUTIONAL neural networks ,LANDSAT satellites ,URBAN trees ,ARTIFICIAL intelligence - Abstract
Urban tree classification enables informed decision-making processes in urban planning and management. This paper introduces a novel data reformation method, pseudo tree crown (PTC), which enhances the feature difference in the input layer and results in the improvement of the accuracy and efficiency of urban tree classification by utilizing artificial intelligence (AI) techniques. The study involved a comparative analysis of the performance of various machine learning (ML) classifiers. The results revealed a significant enhancement in classification accuracy, with an improvement exceeding 10% observed when high spatial resolution imagery captured by an unmanned aerial vehicle (UAV) was utilized. Furthermore, the study found an impressive average classification accuracy of 93% achieved by a classifier built on the PyTorch framework, with ResNet50 leveraged as its convolutional neural network layer. These findings underscore the potential of AI-driven approaches in advancing urban tree classification methodologies for enhanced urban planning and management practices. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
37. Cross-Hole GPR for Soil Moisture Estimation Using Deep Learning.
- Author
-
Pongrac, Blaž, Gleich, Dušan, Malajner, Marko, and Sarjaš, Andrej
- Subjects
SOIL moisture ,DEEP learning ,SOIL moisture measurement ,TRANSMITTING antennas ,CONVOLUTIONAL neural networks ,ANTENNAS (Electronics) - Abstract
This paper presents the design of a high-voltage pulse-based radar and a supervised data processing method for soil moisture estimation. The goal of this research was to design a pulse-based radar to detect changes in soil moisture using a cross-hole approach. The pulse-based radar with three transmitting antennas was placed into a 12 m deep hole, and a receiver with three receiving antennas was placed into a different hole 100 m from the transmitter. The pulse generator was based on a Marx generator with an LC filter, and the receiver used a high-frequency data acquisition card capable of acquiring signals at 3 gigabytes per second. The borehole antennas were designed to operate over a wide frequency band to ensure signal propagation through the soil. A deep regression convolutional network is proposed in this paper to estimate volumetric soil moisture from time-sampled signals. The regression convolutional network is extended to three dimensions to model changes in wave propagation between the transmitted and received signals. The training dataset was acquired over a period of 73 days between the two boreholes, which were separated by 100 m. Soil moisture measurements were acquired at three points 25 m apart to provide ground truth data. Additionally, water was poured into several specially prepared boreholes between the transmitter and receiver antennas to acquire an additional dataset for training, validation, and testing of the convolutional neural networks. Experimental results showed that the proposed system is able to detect changes in volumetric soil moisture using the Tx and Rx antennas. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
38. A Comprehensive Survey on SAR ATR in Deep-Learning Era.
- Author
-
Li, Jianwei, Yu, Zhentao, Yu, Lu, Cheng, Pu, Chen, Jie, and Chi, Cheng
- Subjects
DEEP learning ,CONVOLUTIONAL neural networks ,SUPERVISED learning ,GENERATIVE adversarial networks ,AUTOMATIC target recognition ,DATA augmentation - Abstract
Due to the advantages of Synthetic Aperture Radar (SAR), the study of Automatic Target Recognition (ATR) has become a hot topic. Deep learning, especially the Convolutional Neural Network (CNN), works in an end-to-end way and has powerful feature-extracting abilities. Thus, researchers in SAR ATR also seek solutions from deep learning. We review the related SAR ATR algorithms in this paper. We first introduce the commonly used datasets and evaluation metrics. Then, we introduce the algorithms that predate deep learning: template-matching-, machine-learning- and model-based methods. After that, we introduce the SAR ATR methods of the deep-learning era (after 2017), which are the core of the paper. The non-CNN and CNN architectures used in SAR ATR are summarized first; we found that researchers tend to design specialized CNNs for SAR ATR. Then, the methods proposed to cope with limited samples are reviewed: data augmentation, Generative Adversarial Networks (GAN), electromagnetic simulation, transfer learning, few-shot learning, semi-supervised learning, metric learning and domain knowledge. After that, the imbalance problem, real-time recognition, polarimetric SAR, complex data and adversarial attacks are also reviewed, together with their principles and open problems. Finally, future directions are discussed: the dataset, CNN architecture design, knowledge-driven methods, real-time recognition, explainability and adversarial attacks should be considered in the future. This paper gives readers a quick overview of the current state of the field. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
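Among the limited-sample remedies the survey above lists, data augmentation is the simplest to sketch. Below is a hypothetical SAR-chip augmenter using flips, rotation, and multiplicative gamma noise (a common stand-in for speckle statistics); it is an illustration of the technique, not any specific paper's recipe:

```python
import numpy as np

def augment_sar_chip(chip, rng):
    # Simple geometric + speckle-style augmentations of the kind surveyed
    # for limited-sample SAR ATR (illustrative choices).
    out = [chip]
    out.append(np.fliplr(chip))                      # horizontal flip
    out.append(np.rot90(chip))                       # 90-degree rotation
    # multiplicative gamma-distributed noise with unit mean mimics speckle
    speckle = rng.gamma(shape=4.0, scale=0.25, size=chip.shape)
    out.append(chip * speckle)
    return out

rng = np.random.default_rng(42)
chip = rng.random((64, 64))                          # stand-in for a SAR target chip
augmented = augment_sar_chip(chip, rng)
print(len(augmented))                                # 4 variants including the original
```

Geometric transforms are label-preserving only when target orientation does not matter for the class; for aspect-sensitive SAR targets, rotation augmentation should be applied with care.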
39. DMAU-Net: An Attention-Based Multiscale Max-Pooling Dense Network for the Semantic Segmentation in VHR Remote-Sensing Images.
- Author
-
Yang, Yang, Dong, Junwu, Wang, Yanhui, Yu, Bibo, and Yang, Zhigang
- Subjects
REMOTE-sensing images ,CONVOLUTIONAL neural networks ,REMOTE sensing ,IMAGE recognition (Computer vision) ,IMAGE segmentation ,FEATURE extraction - Abstract
High-resolution remote-sensing images cover more feature information, including texture, structure, shape, and other geometric details, while the relationships among target features are more complex. These factors make it more complicated for classical convolutional neural networks to obtain ideal results when performing feature classification on remote-sensing images. To address this issue, we propose an attention-based multiscale max-pooling dense network (DMAU-Net), based on U-Net, for ground object classification. The network is designed with an integrated max-pooling module that incorporates dense connections in the encoder part to enhance the quality of the feature map, and thus improve the feature-extraction capability of the network. Similarly, in the decoder, we introduce the Efficient Channel Attention (ECA) module, which strengthens effective features and suppresses irrelevant information. To validate the ground object classification performance of the multi-pooling integration network proposed in this paper, we conducted experiments on the Vaihingen and Potsdam datasets provided by the International Society for Photogrammetry and Remote Sensing (ISPRS) and compared DMAU-Net with other mainstream semantic segmentation models. The experimental results show that DMAU-Net effectively improves the accuracy of feature classification in high-resolution remote-sensing images. The feature boundaries obtained by DMAU-Net are clear and regionally complete, enhancing the ability to optimize the edges of features. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
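The Efficient Channel Attention (ECA) module that DMAU-Net adds to its decoder squeezes each channel to a scalar, runs a small 1D convolution across channels, and gates the feature map with a sigmoid. A NumPy sketch with untrained, uniform convolution weights (a real ECA layer learns these weights; shapes and kernel size here are illustrative):

```python
import numpy as np

def eca(feature_map, k=3):
    # Efficient Channel Attention in NumPy: squeeze spatially, run a
    # k-sized 1D conv across channels, gate each channel by a sigmoid.
    C = feature_map.shape[0]                          # layout (C, H, W)
    squeezed = feature_map.mean(axis=(1, 2))          # global average pooling
    padded = np.pad(squeezed, k // 2, mode="edge")    # keep length C after conv
    kernel = np.ones(k) / k                           # untrained weights, for illustration
    conv = np.convolve(padded, kernel, mode="valid")  # cross-channel interaction
    gate = 1.0 / (1.0 + np.exp(-conv))                # sigmoid channel weights in (0, 1)
    return feature_map * gate[:, None, None]

x = np.random.default_rng(0).random((8, 16, 16))
y = eca(x)
```

The appeal of ECA over squeeze-and-excitation blocks is that the 1D convolution captures local cross-channel interaction with only k parameters, avoiding the dimensionality-reducing fully connected layers.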
40. TransHSI: A Hybrid CNN-Transformer Method for Disjoint Sample-Based Hyperspectral Image Classification.
- Author
-
Zhang, Ping, Yu, Haiyang, Li, Pengao, and Wang, Ruili
- Subjects
IMAGE recognition (Computer vision) ,CONVOLUTIONAL neural networks ,TRANSFORMER models ,CLASSIFICATION algorithms ,MULTISENSOR data fusion ,FEATURE extraction - Abstract
Research on hyperspectral image (HSI) classification has seen significant progress with the use of convolutional neural networks (CNNs) and Transformer blocks. However, previous studies primarily incorporated Transformer blocks at the end of their network architectures. Due to significant differences between the spectral and spatial features in HSIs, the extraction of both global and local spectral–spatial features remains incomplete. To address this challenge, this paper introduces a novel method called TransHSI. It incorporates a new spectral–spatial feature extraction module that fuses 3D CNNs with Transformer blocks to extract the local and global spectral features of HSIs, and then combines 2D CNNs and Transformer blocks to comprehensively capture the local and global spatial features of HSIs. Furthermore, a fusion module is proposed, which not only integrates the learned shallow and deep features of HSIs but also applies a semantic tokenizer to transform the fused features, enhancing their discriminative power. This paper conducts experiments on three public datasets: Indian Pines, Pavia University, and Data Fusion Contest 2018. The training and test sets are selected based on a disjoint sampling strategy. We perform a comparative analysis with 11 traditional and advanced HSI classification algorithms. The experimental results demonstrate that the proposed TransHSI algorithm achieves the highest overall accuracies and kappa coefficients, indicating competitive performance. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
41. Self-Supervised Convolutional Neural Network Learning in a Hybrid Approach Framework to Estimate Chlorophyll and Nitrogen Content of Maize from Hyperspectral Images.
- Author
-
Gallo, Ignazio, Boschetti, Mirco, Rehman, Anwar Ur, and Candiani, Gabriele
- Subjects
CONVOLUTIONAL neural networks ,BLENDED learning ,MACHINE learning ,SUPERVISED learning ,CHLOROPHYLL - Abstract
The new generation of available (i.e., PRISMA, ENMAP, DESIS) and future (i.e., ESA-CHIME, NASA-SBG) spaceborne hyperspectral missions provide unprecedented data for environmental and agricultural monitoring, such as crop trait assessment. This paper focuses on retrieving two crop traits, specifically Chlorophyll and Nitrogen content at the canopy level (CCC and CNC), starting from hyperspectral images acquired during the CHIME-RCS project, exploiting a self-supervised learning (SSL) technique. SSL is a machine learning paradigm that leverages unlabeled data to generate valuable representations for downstream tasks, bridging the gap between unsupervised and supervised learning. The proposed method comprises pre-training and fine-tuning procedures: in the first stage, a de-noising Convolutional Autoencoder is trained using pairs of noisy and clean CHIME-like images; the pre-trained Encoder network is utilized as-is or fine-tuned in the second stage. The paper demonstrates the applicability of this technique in hybrid approaches that combine Radiative Transfer Modelling (RTM) and Machine Learning Regression Algorithms (MLRA) to set up a retrieval scheme able to estimate crop traits from new-generation spaceborne hyperspectral data. The results showcase excellent prediction accuracy for estimating CCC (R2 = 0.8318; RMSE = 0.2490) and CNC (R2 = 0.9186; RMSE = 0.7908) for maize crops from CHIME-like images without requiring further ground data calibration. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
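The entry above reports R² alongside RMSE for the CCC and CNC retrievals. For reference, the coefficient of determination is one minus the ratio of residual to total sum of squares; a minimal sketch with made-up canopy values (not the paper's data):

```python
import numpy as np

def r2_score(y_true, y_pred):
    # coefficient of determination: 1 - SS_res / SS_tot
    y_true = np.asarray(y_true, float)
    y_pred = np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# hypothetical canopy chlorophyll values for illustration only
print(r2_score([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))
```

R² is 1.0 for a perfect fit and 0.0 for a model no better than predicting the mean; it can go negative for a model worse than the mean, which is why it is usually read together with RMSE.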
42. Vehicle Detection in Multisource Remote Sensing Images Based on Edge-Preserving Super-Resolution Reconstruction.
- Author
-
Zhu, Hong, Lv, Yanan, Meng, Jian, Liu, Yuxuan, Hu, Liuru, Yao, Jiaqi, and Lu, Xionghanxuan
- Subjects
CONVOLUTIONAL neural networks ,OBJECT recognition (Computer vision) ,INTELLIGENT transportation systems ,REMOTE sensing ,ARTIFICIAL satellites ,IMAGE reconstruction ,AUTOMOBILE size ,TRANSPORTATION management system - Abstract
As an essential technology for intelligent transportation management and traffic risk prevention and control, vehicle detection plays a significant role in the comprehensive evaluation of intelligent transportation systems. However, limited by the small size of vehicles in satellite remote sensing images and the lack of sufficient texture features, its detection performance is far from satisfactory. Because the edge structure of small objects becomes unclear during super-resolution (SR) reconstruction, deep convolutional neural networks are no longer effective at extracting small-scale feature information. Therefore, a vehicle detection network based on remote sensing images (VDNET-RSI) is constructed in this article. VDNET-RSI contains a two-stage convolutional neural network for vehicle detection. In the first stage, partial convolution-based padding is combined with an improved Local Implicit Image Function (LIIF) to reconstruct high-resolution remote sensing images. Then, the network associated with the results from the first stage is used in the second stage for vehicle detection. In the second stage, a super-resolution module, detection-head module and convolutional block attention module are incorporated into the object detection framework to improve the performance of small object detection in large-scale remote sensing images. The publicly available DIOR dataset is selected as the experimental dataset to compare the performance of VDNET-RSI with that of state-of-the-art models in vehicle detection based on satellite remote sensing images. The experimental results demonstrate that the overall precision of VDNET-RSI reached 62.9%, about 6.3%, 38.6% and 39.8% higher than that of YOLOv5, Faster-RCNN and FCOS, respectively. The conclusions of this paper can provide a theoretical basis and key technical support for the development of intelligent transportation. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
43. Acoustic Impedance Inversion from Seismic Imaging Profiles Using Self Attention U-Net.
- Author
-
Tao, Liurong, Ren, Haoran, and Gu, Zhiwei
- Subjects
ACOUSTIC impedance ,IMAGING systems in seismology ,CONVOLUTIONAL neural networks ,INVERSION (Geophysics) ,INVERSE problems ,DEEP learning ,NONLINEAR equations - Abstract
Seismic impedance inversion is a vital tool for geological interpretation and reservoir investigation from a geophysical perspective. However, it is inevitably an ill-posed problem due to noise and the band-limited character of seismic data. Artificial neural networks have been used to solve nonlinear inverse problems in recent years. This research obtained an acoustic impedance profile by feeding a seismic profile and background impedance into a well-trained self-attention U-Net. The U-Net converged after an appropriate number of iterations, and its output predicted the impedance profiles in testing. To assess the quality of the predicted profiles from different perspectives, e.g., correlation, regression, and similarity, we used four kinds of indexes. For comparison, results were also computed with conventional methods (e.g., deconvolution with recursive inversion, and TV regularization) and a 1D neural network. The self-attention U-Net proved robust to noise and requires no prior knowledge, and its spatial continuity is also better than that of the deconvolution, regularization, and 1D deep learning methods. The U-Net in this paper is a fully convolutional neural network, so there are no limits on the shape of the input. Based on this, a large impedance profile can be predicted by a U-Net trained on a patchy training dataset. In addition, the proposed method was applied to field data from the Ceduna survey without any labels. The predictions prove that the well-trained network can be generalized from synthetic data to field data. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
44. A Comparative Study of Different CNN Models and Transfer Learning Effect for Underwater Object Classification in Side-Scan Sonar Images.
- Author
-
Du, Xing, Sun, Yongfu, Song, Yupeng, Sun, Huifeng, and Yang, Lei
- Subjects
SONAR imaging ,DEEP learning ,CONVOLUTIONAL neural networks ,AUTOMATIC target recognition ,IMAGE recognition (Computer vision) ,SONAR - Abstract
With the development of deep learning techniques, convolutional neural networks (CNN) are increasingly being used in image recognition for marine surveys and underwater object classification. Automatic recognition of targets in side-scan sonar (SSS) images using CNNs can improve recognition accuracy and efficiency. However, the vast selection of CNN models makes it challenging to choose models for target recognition in SSS images. Therefore, this paper comprehensively compares the prediction accuracy and computational performance of different CNN models. First, four traditional CNN models were applied to train on and predict the same submarine SSS dataset, using both the original models and models with transfer learning. Then, we examined the prediction accuracy and computational performance of the four CNN models. Results showed that transfer learning enhances the accuracy of all CNN models, with smaller improvements for AlexNet and VGG-16 and greater improvements for GoogleNet and ResNet101. GoogleNet has the highest prediction accuracy (100% on the training dataset and 94.27% on the test dataset) with acceptable computational cost. The findings of this work are useful for future model selection in target recognition in SSS images. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
45. Unsupervised SAR Image Change Detection Based on Histogram Fitting Error Minimization and Convolutional Neural Network.
- Author
-
Zhang, Kaiyu, Lv, Xiaolei, Guo, Bin, and Chai, Huiming
- Subjects
CONVOLUTIONAL neural networks ,SYNTHETIC aperture radar ,DEEP learning ,HISTOGRAMS ,REMOTE sensing - Abstract
Synthetic aperture radar (SAR) image change detection is one of the most important applications in remote sensing. Before performing change detection, the original SAR image is often cropped to extract the region of interest (ROI). However, the size of the ROI often affects the change detection results, so it is necessary to detect changes using local information. This paper proposes a novel unsupervised change detection framework based on deep learning. The method proceeds as follows: First, we use histogram fitting error minimization (HFEM) to threshold a difference image (DI). Then, the DI is fed into a convolutional neural network (CNN). The proposed method is therefore called HFEM-CNN. We test three different CNN architectures for the framework: U-Net, PSPNet and a designed fully convolutional neural network (FCNN). The overall loss function is a weighted average of a pixel loss and a neighborhood loss, with the weight between them determined by the manually set parameter λ. Compared to other recently proposed methods, HFEM-CNN does not need a fragment removal procedure as post-processing. This paper conducts experiments for water and building change detection on three datasets. The experiments are divided into two parts: whole-data experiments and randomly cropped data experiments. The whole-data experiments show that the performance of our method is close to that of other methods on complete datasets, and slightly better than that of traditional methods. The randomly cropped data experiments perform local change detection using patches cropped from the whole datasets; there, the average kappa coefficient of our method on 63 patches is over 3.16% higher than that of other methods. Experiments also show that the proposed method is suitable for local change detection and robust to randomness and the choice of hyperparameters. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
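The HFEM step above thresholds a difference image by minimizing a histogram fitting error. The exact HFEM criterion is the paper's own; the closely related classical Kittler–Illingworth minimum-error threshold, which fits two Gaussians to the histogram and picks the split with the lowest fitting criterion, can be sketched as:

```python
import numpy as np

def min_error_threshold(values, bins=256):
    # Kittler-Illingworth minimum-error thresholding: for each candidate split,
    # fit a Gaussian to each side of the histogram and score the two-Gaussian
    # fit; return the split with the lowest criterion. (Illustrative analogue
    # of histogram-fitting thresholding; HFEM itself differs in detail.)
    hist, edges = np.histogram(values, bins=bins)
    p = hist / hist.sum()
    centers = 0.5 * (edges[:-1] + edges[1:])
    best_t, best_j = centers[0], np.inf
    for t in range(1, bins - 1):
        p1, p2 = p[:t].sum(), p[t:].sum()
        if p1 < 1e-6 or p2 < 1e-6:
            continue
        m1 = (p[:t] * centers[:t]).sum() / p1
        m2 = (p[t:] * centers[t:]).sum() / p2
        v1 = (p[:t] * (centers[:t] - m1) ** 2).sum() / p1
        v2 = (p[t:] * (centers[t:] - m2) ** 2).sum() / p2
        if v1 < 1e-12 or v2 < 1e-12:
            continue
        j = (1 + 2 * (p1 * np.log(np.sqrt(v1)) + p2 * np.log(np.sqrt(v2)))
               - 2 * (p1 * np.log(p1) + p2 * np.log(p2)))
        if j < best_j:
            best_j, best_t = j, centers[t]
    return best_t

rng = np.random.default_rng(1)
# bimodal stand-in for a difference image: many unchanged pixels, few changed
di = np.concatenate([rng.normal(0.2, 0.05, 5000), rng.normal(0.8, 0.05, 500)])
t = min_error_threshold(di)
```

On well-separated bimodal data like this, the criterion's minimum falls near the Bayes decision boundary between the two fitted Gaussians.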
46. SSAformer: Spatial–Spectral Aggregation Transformer for Hyperspectral Image Super-Resolution.
- Author
-
Wang, Haoqian, Zhang, Qi, Peng, Tao, Xu, Zhongjie, Cheng, Xiangai, Xing, Zhongyang, and Li, Teng
- Subjects
TRANSFORMER models ,HIGH resolution imaging ,CONVOLUTIONAL neural networks ,REMOTE sensing ,ENVIRONMENTAL monitoring ,SPECTRAL imaging ,IMAGE reconstruction algorithms - Abstract
The hyperspectral image (HSI) distinguishes itself in material identification through its exceptional spectral resolution. However, its spatial resolution is constrained by hardware limitations, prompting the evolution of HSI super-resolution (SR) techniques. Single HSI SR endeavors to reconstruct high-spatial-resolution HSI from low-spatial-resolution inputs, and recent progress in deep learning-based algorithms has significantly advanced the quality of reconstructed images. However, convolutional methods struggle to extract comprehensive spatial and spectral features. Transformer-based models have yet to harness long-range dependencies across both dimensions fully, thus inadequately integrating spatial and spectral data. To solve the above problem, in this paper, we propose a new HSI SR method, SSAformer, which merges the strengths of CNNs and Transformers. It introduces specially designed attention mechanisms for HSI, including spatial and spectral attention modules, and overcomes the previous challenges in extracting and amalgamating spatial and spectral information. Evaluations on benchmark datasets show that SSAformer surpasses contemporary methods in enhancing spatial details and preserving spectral accuracy, underscoring its potential to expand HSI's utility in various domains, such as environmental monitoring and remote sensing. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
47. Changes in the Water Area of an Inland River Terminal Lake (Taitma Lake) Driven by Climate Change and Human Activities, 2017–2022.
- Author
-
Zi, Feng, Wang, Yong, Lu, Shanlong, Ikhumhen, Harrison Odion, Fang, Chun, Li, Xinru, Wang, Nan, and Kuang, Xinya
- Subjects
ENDORHEIC lakes ,WATER resources development ,CONVOLUTIONAL neural networks ,LAKES ,DEEP learning ,CLIMATE change - Abstract
Using a dataset capturing the seasonal and annual water body distribution of the lower Qarqan River in the Taitma Lake area from 2017 to 2022, combined with meteorological and hydraulic engineering data, the spatial and temporal change patterns of the Taitma Lake watershed area were determined. Analyses were conducted using PlanetScope (PS) satellite images and a deep learning model. The results revealed the following: ① Deep learning-based water body extraction provides significantly greater accuracy than the conventional water body index approach. With an accuracy of up to 96.0%, UPerNet provided the most effective extraction results among the three convolutional neural networks (U-Net, DeeplabV3+, and UPerNet) used for semantic segmentation. ② Between 2017 and 2022, Taitma Lake's water area decreased rapidly, with the water distribution shifting predominantly in the east–west direction rather than north–south. The shifts during 2017–2020 and 2020–2022 were both clearly discernible, with the latter stage being more significant than the former. ③ Observations indicate that human activity has been the primary influence on Taitma Lake's changing water area over the last six years. This study provides a valuable scientific basis for water resource allocation that balances the development of water resources in the middle and upper reaches of the Tarim and Qarqan Rivers with the ecological protection of the downstream Taitma Lake. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
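The conventional water-body-index baseline that the deep learning extraction above is compared against is typically NDWI-style thresholding: NDWI = (Green − NIR) / (Green + NIR), with pixels above a threshold mapped to water. A sketch with made-up band reflectances (not the study's imagery):

```python
import numpy as np

def ndwi_water_mask(green, nir, threshold=0.0):
    # Normalized Difference Water Index: water reflects green light but
    # absorbs near-infrared, so NDWI > 0 typically indicates water.
    g = np.asarray(green, float)
    n = np.asarray(nir, float)
    index = (g - n) / np.maximum(g + n, 1e-9)   # guard against divide-by-zero
    return index > threshold

green = np.array([[0.30, 0.05], [0.25, 0.08]])  # hypothetical reflectances
nir   = np.array([[0.05, 0.40], [0.04, 0.35]])
print(ndwi_water_mask(green, nir))              # water pixels are True
```

Index thresholding like this is fast but brittle under turbidity and shadow, which is one reason the study found semantic segmentation networks such as UPerNet markedly more accurate.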
48. MEA-EFFormer: Multiscale Efficient Attention with Enhanced Feature Transformer for Hyperspectral Image Classification.
- Author
-
Sun, Qian, Zhao, Guangrui, Fang, Yu, Fang, Chenrong, Sun, Le, and Li, Xingying
- Subjects
IMAGE recognition (Computer vision) ,CONVOLUTIONAL neural networks ,DEEP learning ,TRANSFORMER models ,FEATURE extraction - Abstract
Hyperspectral image classification (HSIC) has garnered increasing attention among researchers. While classical networks like convolutional neural networks (CNNs) have achieved satisfactory results with the advent of deep learning, they are confined to processing local information. Vision transformers, despite being effective at establishing long-distance dependencies, face challenges in extracting high-representation features for high-dimensional images. In this paper, we present the multiscale efficient attention with enhanced feature transformer (MEA-EFFormer), which is designed for the efficient extraction of spectral–spatial features, leading to effective classification. MEA-EFFormer employs a multiscale efficient attention feature extraction module to initially extract 3D convolution features and applies effective channel attention to refine spectral information. Following this, 2D convolution features are extracted and integrated with local binary pattern (LBP) spatial information to augment their representation. Then, the processed features are fed into a spectral–spatial enhancement attention (SSEA) module that facilitates interactive enhancement of spectral–spatial information across the three dimensions. Finally, these features undergo classification through a transformer encoder. We evaluate MEA-EFFormer against several state-of-the-art methods on three datasets and demonstrate its outstanding HSIC performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
49. Object-Based Semi-Supervised Spatial Attention Residual UNet for Urban High-Resolution Remote Sensing Image Classification.
- Author
-
Lu, Yuanbing, Li, Huapeng, Zhang, Ce, and Zhang, Shuqing
- Subjects
CONVOLUTIONAL neural networks ,DISTRIBUTION (Probability theory) ,WILCOXON signed-rank test ,DEEP learning ,LAND cover - Abstract
Accurate urban land cover information is crucial for effective urban planning and management. While convolutional neural networks (CNNs) demonstrate superior feature learning and prediction capabilities using image-level annotations, the inherent mixed-category nature of input image patches leads to classification errors along object boundaries. Fully convolutional neural networks (FCNs) excel at pixel-wise fine segmentation, making them less susceptible to heterogeneous content, but they require fully annotated dense image patches, which may not be readily available in real-world scenarios. This paper proposes an object-based semi-supervised spatial attention residual UNet (OS-ARU) model. First, multiscale segmentation is performed to obtain segments from a remote sensing image, and segments containing sample points are assigned the categories of the corresponding points, which are used to train the model. Then, the trained model predicts class probabilities for all segments. Each unlabeled segment's probability distribution is compared against those of labeled segments for similarity matching under a threshold constraint. Through label propagation, pseudo-labels are assigned to unlabeled segments exhibiting high similarity to labeled ones. Finally, the model is retrained using the augmented training set incorporating the pseudo-labeled segments. Comprehensive experiments on aerial image benchmarks for Vaihingen and Potsdam demonstrate that the proposed OS-ARU achieves higher classification accuracy than state-of-the-art models, including OCNN, 2OCNN, and standard OS-U, reaching overall accuracies (OA) of 87.83% and 86.71% on the two benchmarks, respectively. The performance improvements over the baseline methods are statistically significant according to the Wilcoxon Signed-Rank Test. Despite using significantly fewer sparse annotations, this semi-supervised approach still achieves accuracy comparable to that of the same model under full supervision. The proposed method thus takes a step toward substantially alleviating the heavy sampling burden of FCNs (densely sampled deep learning models) in handling the complex task of land cover identification and classification. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
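The label-propagation step above matches each unlabeled segment's class-probability distribution against those of labeled segments under a threshold constraint. A hypothetical sketch using cosine similarity as the matching measure (the function name, threshold value, and similarity measure are illustrative assumptions, not the paper's specification):

```python
import numpy as np

def propagate_pseudo_labels(labeled_probs, labeled_y, unlabeled_probs, thr=0.9):
    # For each unlabeled segment, find the most similar labeled segment by
    # cosine similarity of predicted class-probability vectors; copy its label
    # only if the similarity clears the threshold, else leave it unlabeled.
    def unit(v):
        return v / np.linalg.norm(v, axis=-1, keepdims=True)
    sims = unit(unlabeled_probs) @ unit(labeled_probs).T   # (U, L) similarity matrix
    best = sims.argmax(axis=1)                             # closest labeled segment
    return np.where(sims.max(axis=1) >= thr, labeled_y[best], -1)  # -1 = unlabeled

labeled_probs = np.array([[0.9, 0.1], [0.1, 0.9]])   # two labeled segments
labeled_y = np.array([0, 1])
unlabeled = np.array([[0.85, 0.15],                  # confident, matches class 0
                      [0.50, 0.50]])                 # ambiguous, stays unlabeled
print(propagate_pseudo_labels(labeled_probs, labeled_y, unlabeled))
```

Thresholding the similarity is what keeps low-confidence pseudo-labels out of the retraining set; the ambiguous second segment above is deliberately left unlabeled.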
50. Remote Sensing Image Dehazing via a Local Context-Enriched Transformer.
- Author
-
Nie, Jing, Xie, Jin, and Sun, Hanqing
- Subjects
TRANSFORMER models ,REMOTE sensing ,CONVOLUTIONAL neural networks ,IMAGE reconstruction ,IMAGE processing - Abstract
Remote sensing image dehazing is a well-known remote sensing image processing task focused on restoring clean images from hazy images. The Transformer network, based on the self-attention mechanism, has demonstrated remarkable advantages in various image restoration tasks due to its capacity to capture long-range dependencies within images. However, it is weak at modeling local context. Conversely, convolutional neural networks (CNNs) are adept at capturing local contextual information. Local context provides more detail, while long-range dependencies capture global structure; combining the two is beneficial for remote sensing image dehazing. Therefore, in this paper, we propose a CNN-based adaptive local context enrichment module (ALCEM) to extract contextual information within local regions. Subsequently, we integrate the proposed ALCEM into the multi-head self-attention and feed-forward network of the Transformer, constructing a novel locally enhanced attention (LEA) and a local continuous-enhancement feed-forward network (LCFN). The LEA utilizes the ALCEM to inject local context information complementary to the long-range relationships modeled by multi-head self-attention, which is beneficial for removing haze and restoring details. The LCFN extracts multi-scale spatial information and selectively fuses it via the ALCEM, supplying more informative features than existing regular feed-forward networks with only position-specific information flow. Powered by the LEA and LCFN, a novel Transformer-based dehazing network termed LCEFormer is proposed to restore clear images from hazy remote sensing images, combining the advantages of CNNs and Transformers. Experiments conducted on three distinct datasets, namely DHID, ERICE, and RSID, demonstrate that our proposed LCEFormer achieves state-of-the-art performance in hazy scenes. Specifically, our LCEFormer outperforms DCIL by 0.78 dB in PSNR and 0.018 in SSIM on the DHID dataset. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
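The 0.78 dB PSNR gain cited above is measured on a logarithmic scale: PSNR = 10·log10(peak² / MSE). A minimal implementation (SSIM involves local windowed statistics and is omitted here; the arrays below are illustrative):

```python
import numpy as np

def psnr(reference, restored, data_range=1.0):
    # Peak signal-to-noise ratio in decibels: higher means the restored
    # image is closer to the reference.
    diff = np.asarray(reference, float) - np.asarray(restored, float)
    mse = np.mean(diff ** 2)
    return float(10.0 * np.log10(data_range ** 2 / mse))

ref = np.zeros((8, 8))
degraded = ref + 0.1          # uniform error of 0.1 -> MSE = 0.01
print(psnr(ref, degraded))    # 20.0 dB
```

Because the scale is logarithmic, a 0.78 dB improvement corresponds to roughly a 16% reduction in mean squared error, a substantial gap between dehazing methods.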