Search Results (437 results)
2. Deep-Learning-Based Daytime COT Retrieval and Prediction Method Using FY4A AGRI Data.
- Author: Xu, Fanming; Song, Biao; Chen, Jianhua; Guan, Runda; Zhu, Rongjie; Liu, Jiayu; Qiu, Zhongfeng
- Subjects: Convolutional neural networks; Prediction models; Deep learning; Forecasting
- Abstract:
The traditional method for retrieving cloud optical thickness (COT) relies on a Look-Up Table (LUT), for which researchers must make a series of idealized assumptions and conduct extensive observations and feature recording, consuming considerable resources. The emergence of deep learning effectively addresses the shortcomings of this traditional approach. In this paper, we first propose a daytime (SOZA < 70°) COT retrieval algorithm based on FY-4A AGRI. We establish and train a Convolutional Neural Network (CNN) model for COT retrieval, CM4CR, with CALIPSO's COT product, spatially and temporally synchronized, as the ground truth. Then, a deep learning method extended from video prediction models is adopted to predict COT values based on the retrieval results obtained from CM4CR. The COT prediction model (CPM) consists of an encoder, a predictor, and a decoder. On this basis, we further incorporate a time embedding module to enhance the model's ability to learn from irregular time intervals in the input COT sequence. During the training phase, we employ Charbonnier Loss and Edge Loss to enhance the model's capability to represent COT details. Experiments indicate that CM4CR outperforms existing COT retrieval methods, and that its predictions perform better across several metrics than other benchmark prediction models. Additionally, this paper also investigates the impact of different lengths of COT input sequences and of the time intervals between adjacent COT frames on prediction performance. [ABSTRACT FROM AUTHOR]
- Published: 2024
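The Charbonnier loss used to train the COT prediction model above can be sketched in a few lines. This is a minimal NumPy illustration; the epsilon value and the mean reduction are assumptions, not settings taken from the paper:

```python
import numpy as np

def charbonnier_loss(pred, target, eps=1e-3):
    """Charbonnier loss: a smooth, differentiable variant of L1,
    mean(sqrt((pred - target)^2 + eps^2))."""
    diff = pred - target
    return np.mean(np.sqrt(diff * diff + eps * eps))

# Toy example on a 2x2 "COT" patch
pred = np.array([[1.0, 2.0], [3.0, 4.0]])
target = np.array([[1.0, 2.5], [3.0, 4.0]])
loss = charbonnier_loss(pred, target)
```

The loss behaves like L1 for large errors but stays smooth near zero, which is why it is often preferred over plain L1 for image regression.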
3. A Method for Underwater Acoustic Target Recognition Based on the Delay-Doppler Joint Feature.
- Author: Du, Libin; Wang, Zhengkai; Lv, Zhichao; Han, Dongyue; Wang, Lei; Yu, Fei; Lan, Qing
- Subjects: Convolutional neural networks; Architectural acoustics; Object recognition (Computer vision); Fourier transforms
- Abstract:
To address the problem of identifying complex underwater acoustic targets from a single Time–Frequency (TF) signal feature, this paper designs a method that recognizes underwater targets based on a Delay-Doppler joint feature. First, the method uses the symplectic finite Fourier transform (SFFT) to extract the Delay-Doppler (DD) features of underwater acoustic signals, analyzes the Time–Frequency features at the same time, and combines the DD and TF features to form a joint feature (TF-DD). Three types of convolutional neural networks are used to verify that TF-DD can effectively improve the accuracy of target recognition. Second, this paper designs an object recognition model (TF-DD-CNN) that takes the joint feature as input, which simplifies the neural network's overall structure and improves training efficiency. Ship-radiated noise is employed to validate the efficacy of TF-DD-CNN for target identification. The results demonstrate that the joint feature and the TF-DD-CNN model introduced in this study can proficiently detect ships, and the model notably enhances detection precision. [ABSTRACT FROM AUTHOR]
- Published: 2024
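The symplectic finite Fourier transform at the heart of the Delay-Doppler feature extraction above maps a time-frequency grid to the delay-Doppler domain. A minimal NumPy sketch follows; the grid layout (time slots × subcarriers) and the symmetric normalization are assumptions, not details taken from the paper:

```python
import numpy as np

def sfft(Y_tf):
    """Symplectic finite Fourier transform: map a time-frequency grid
    Y_tf[n, m] (N time slots x M subcarriers) to the delay-Doppler
    domain via an FFT over time (Doppler axis) and an IFFT over
    frequency (delay axis), with symmetric 1/sqrt(N*M) normalization."""
    N, M = Y_tf.shape
    return np.fft.fft(np.fft.ifft(Y_tf, axis=1), axis=0) * np.sqrt(M / N)

# A constant time-frequency grid concentrates into one impulse at (0, 0)
Z = sfft(np.ones((4, 8)))
```

With this normalization the transform is energy-preserving, so feature magnitudes are comparable across grid sizes.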
4. Joint Classification of Hyperspectral and LiDAR Data Based on Adaptive Gating Mechanism and Learnable Transformer.
- Author: Wang, Minhui; Sun, Yaxiu; Xiang, Jianhong; Sun, Rui; Zhong, Yu
- Subjects: Transformer models; Convolutional neural networks; LiDAR; Digital elevation models; Transfer matrix; Data fusion (Statistics)
- Abstract:
Utilizing multi-modal data, as opposed to hyperspectral image (HSI) data alone, enhances target identification accuracy in remote sensing. Transformers are applied to multi-modal data classification for their long-range dependency modeling but often overlook intrinsic image structure by directly flattening image blocks into vectors. Moreover, as the encoder deepens, unprofitable information negatively impacts classification performance. Therefore, this paper proposes a learnable transformer with an adaptive gating mechanism (AGMLT). Firstly, a spectral–spatial adaptive gating mechanism (SSAGM) is designed to comprehensively extract local information from images. It mainly contains point depthwise attention (PDWA) and asymmetric depthwise attention (ADWA). The former extracts spectral information from HSI, and the latter extracts spatial information from HSI and elevation information from LiDAR-derived rasterized digital surface models (LiDAR-DSM). By omitting linear layers, local continuity is maintained. Then, LayerScale and a learnable transition matrix are introduced into the original transformer encoder and self-attention to form the learnable transformer (L-Former). It improves data dynamics and prevents performance degradation as the encoder deepens. Subsequently, learnable cross-attention (LC-Attention) with the learnable transfer matrix is designed to augment the fusion of multi-modal data by enriching feature information. Finally, poly loss, known for its adaptability with multi-modal data, is employed in training the model. Experiments are conducted on four widely used multi-modal datasets: Trento (TR), MUUFL (MU), Augsburg (AU), and Houston2013 (HU). The results show that AGMLT achieves optimal performance over some existing models. [ABSTRACT FROM AUTHOR]
- Published: 2024
5. Fire-Net: Rapid Recognition of Forest Fires in UAV Remote Sensing Imagery Using Embedded Devices.
- Author: Li, Shouliang; Han, Jiale; Chen, Fanghui; Min, Rudong; Yi, Sixue; Yang, Zhen
- Subjects: Forest fires; Convolutional neural networks; Forest monitoring; Drone aircraft; Wildfires
- Abstract:
Forest fires pose a catastrophic threat to Earth's ecology as well as to human beings. Timely and accurate monitoring of forest fires can significantly reduce potential casualties and property damage. To address these problems, this paper proposes Fire-Net, a lightweight forest fire recognition model for unmanned aerial vehicle (UAV) imagery, which has a multi-stage structure and incorporates cross-channel attention following the fifth stage. This enables the model to perceive features at various scales, particularly small-scale fire sources in wild forest scenes. Through training and testing on a real-world dataset, various lightweight convolutional neural networks were evaluated on embedded devices. The experimental outcomes indicate that Fire-Net attained an accuracy of 98.18%, a precision of 99.14%, and a recall of 98.01%, surpassing the current leading methods. Furthermore, the model achieves an average inference time of 10 milliseconds per image and operates at 86 frames per second (FPS) on embedded devices. [ABSTRACT FROM AUTHOR]
- Published: 2024
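The cross-channel attention that Fire-Net inserts after its fifth stage is not specified in the abstract; one common realization is squeeze-and-excitation-style gating, sketched below in NumPy. The reduction ratio and the weight matrices `w1` and `w2` are illustrative assumptions, not the paper's parameters:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x, w1, w2):
    """Squeeze-and-excitation style cross-channel attention.
    x: feature map (C, H, W); w1: (C//r, C) reduce; w2: (C, C//r) expand."""
    squeeze = x.mean(axis=(1, 2))                       # global average pool -> (C,)
    gate = sigmoid(w2 @ np.maximum(w1 @ squeeze, 0.0))  # per-channel gate in (0, 1)
    return x * gate[:, None, None]                      # rescale each channel

rng = np.random.default_rng(0)
C, r = 8, 2
x = rng.standard_normal((C, 6, 6))
w1 = 0.1 * rng.standard_normal((C // r, C))   # hypothetical learned weights
w2 = 0.1 * rng.standard_normal((C, C // r))
y = channel_attention(x, w1, w2)
```

Because each gate lies in (0, 1), the module can only attenuate channels, letting the network emphasize the ones most indicative of small fire sources.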
6. Real-Time Wildfire Monitoring Using Low-Altitude Remote Sensing Imagery.
- Author: Tong, Hongwei; Yuan, Jianye; Zhang, Jingjing; Wang, Haofei; Li, Teng
- Subjects: Convolutional neural networks; Transformer models; Drone aircraft; Summer; Remote sensing; Fire detectors
- Abstract:
With rising global temperatures, wildfires frequently occur worldwide during the summer season. The timely detection of these fires, based on unmanned aerial vehicle (UAV) images, can significantly reduce the damage they cause. Existing Convolutional Neural Network (CNN)-based fire detection methods usually stack multiple convolutional layers to enlarge the receptive field, but this compromises real-time performance. This paper proposes a novel real-time semantic segmentation network called FireFormer, which combines the strengths of CNNs and Transformers to detect fires. A lightweight ResNet18 tailored for efficient fire segmentation is adopted as the encoder, and a Forest Fire Transformer Block (FFTB) rooted in the Transformer architecture is proposed as the decoder. Additionally, to accurately detect and segment small fire spots, we have developed a novel Feature Refinement Network (FRN) to enhance fire segmentation accuracy. The experimental results demonstrate that the proposed FireFormer achieves state-of-the-art performance on the publicly available forest fire dataset FLAME, with an impressive 73.13% IoU and 84.48% F1 score. [ABSTRACT FROM AUTHOR]
- Published: 2024
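The IoU and F1 scores FireFormer reports can be computed from binary masks as follows; this is a sketch of the standard metrics, not code from the paper:

```python
import numpy as np

def iou_f1(pred, gt):
    """IoU and F1 (Dice) for binary segmentation masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    iou = inter / union if union else 1.0
    denom = pred.sum() + gt.sum()
    f1 = 2 * inter / denom if denom else 1.0
    return iou, f1

# Toy 2x3 fire masks: 2 pixels agree, 1 false positive, 1 missed pixel
pred = np.array([[1, 1, 0], [0, 1, 0]])
gt = np.array([[1, 0, 0], [0, 1, 1]])
iou, f1 = iou_f1(pred, gt)
```

Note that F1 (Dice) is always at least as large as IoU on the same masks, which is why the two numbers in the abstract differ.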
7. MBT-UNet: Multi-Branch Transform Combined with UNet for Semantic Segmentation of Remote Sensing Images.
- Author: Liu, Bin; Li, Bing; Sreeram, Victor; Li, Shuofeng
- Subjects: Convolutional neural networks; Transformer models; Remote sensing; Environmental monitoring; Resource management
- Abstract:
Remote sensing (RS) images play an indispensable role in many key fields such as environmental monitoring, precision agriculture, and urban resource management. Traditional deep convolutional neural networks have the problem of limited receptive fields. To address this problem, this paper introduces a hybrid network model that combines the advantages of CNN and Transformer, called MBT-UNet. First, a multi-branch encoder design based on the pyramid vision transformer (PVT) is proposed to effectively capture multi-scale feature information; second, an efficient feature fusion module (FFM) is proposed to optimize the collaboration and integration of features at different scales; finally, in the decoder stage, a multi-scale upsampling module (MSUM) is proposed to further refine the segmentation results and enhance segmentation accuracy. We conduct experiments on the ISPRS Vaihingen dataset, the Potsdam dataset, the LoveDA dataset, and the UAVid dataset. Experimental results show that MBT-UNet surpasses state-of-the-art algorithms in key performance indicators, confirming its superior performance in high-precision remote sensing image segmentation tasks. [ABSTRACT FROM AUTHOR]
- Published: 2024
8. Radar Emitter Recognition Based on Spiking Neural Networks.
- Author: Luo, Zhenghao; Wang, Xingdong; Yuan, Shuo; Liu, Zhangmeng
- Subjects: Artificial neural networks; Radar signal processing; Military electronics; Convolutional neural networks; Electronic measurements
- Abstract:
Efficient and effective radar emitter recognition is critical for electronic support measurement (ESM) systems. However, in complex electromagnetic environments, intercepted pulse trains generally contain substantial data noise, including spurious and missing pulses. Currently, radar emitter recognition methods utilizing traditional artificial neural networks (ANNs) like CNNs and RNNs are susceptible to data noise and require intensive computations, posing challenges to meeting the performance demands of modern ESM systems. Spiking neural networks (SNNs) exhibit stronger representational capabilities compared to traditional ANNs due to the temporal dynamics of spiking neurons and richer information encoded in precise spike timing. Furthermore, SNNs achieve higher computational efficiency by performing event-driven sparse addition calculations. In this paper, a lightweight spiking neural network is proposed by combining direct coding, leaky integrate-and-fire (LIF) neurons, and surrogate gradients to recognize radar emitters. Additionally, an improved SNN for radar emitter recognition is proposed, leveraging the local timing structure of pulses to enhance adaptability to data noise. Simulation results demonstrate the superior performance of the proposed method over existing methods. [ABSTRACT FROM AUTHOR]
- Published: 2024
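The leaky integrate-and-fire (LIF) neuron underlying the SNN above can be simulated directly. This discrete-time sketch uses assumed constants (time constant, threshold, reset) and omits the surrogate-gradient training the abstract describes:

```python
import numpy as np

def lif_simulate(current, tau=10.0, v_th=1.0, v_reset=0.0, dt=1.0):
    """Discrete-time leaky integrate-and-fire neuron: the membrane
    potential v leaks toward 0, integrates the input current, and
    emits a spike (then resets) whenever v crosses the threshold."""
    v, spikes, trace = 0.0, [], []
    for i in current:
        v = v + dt * (i - v) / tau
        if v >= v_th:
            spikes.append(1)
            v = v_reset
        else:
            spikes.append(0)
        trace.append(v)
    return np.array(spikes), np.array(trace)

# A constant supra-threshold drive produces periodic spiking
spikes, _ = lif_simulate(np.full(100, 2.0))
```

The event-driven nature of the output (mostly zeros, occasional spikes) is what gives SNNs their sparse-addition efficiency advantage over dense ANN activations.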
9. A Novel Mamba Architecture with a Semantic Transformer for Efficient Real-Time Remote Sensing Semantic Segmentation.
- Author: Ding, Hao; Xia, Bo; Liu, Weilin; Zhang, Zekai; Zhang, Jinglin; Wang, Xing; Xu, Sen
- Subjects: Convolutional neural networks; Remote sensing; Transformer models; Computational complexity; Earthquakes
- Abstract:
Real-time remote sensing segmentation technology is crucial for unmanned aerial vehicles (UAVs) in battlefield surveillance, land characterization observation, earthquake disaster assessment, etc., and can significantly enhance the application value of UAVs in military and civilian fields. To realize this potential, it is essential to develop real-time semantic segmentation methods that can be applied to resource-limited platforms, such as edge devices. The majority of mainstream real-time semantic segmentation methods rely on convolutional neural networks (CNNs) and transformers. However, CNNs cannot effectively capture long-range dependencies, while transformers have high computational complexity. This paper proposes a novel remote sensing Mamba architecture for real-time segmentation tasks in remote sensing, named RTMamba. Specifically, the backbone utilizes a Visual State-Space (VSS) block to extract deep features and maintains linear computational complexity, thereby capturing long-range contextual information. Additionally, a novel Inverted Triangle Pyramid Pooling (ITP) module is incorporated into the decoder. The ITP module can effectively filter redundant feature information and enhance the perception of objects and their boundaries in remote sensing images. Extensive experiments were conducted on three challenging aerial remote sensing segmentation benchmarks, including Vaihingen, Potsdam, and LoveDA. The results show that RTMamba achieves competitive performance advantages in terms of segmentation accuracy and inference speed compared to state-of-the-art CNN and transformer methods. To further validate the deployment potential of the model on embedded devices with limited resources, such as UAVs, we conducted tests on the Jetson AGX Orin edge device. The experimental results demonstrate that RTMamba achieves impressive real-time segmentation performance. [ABSTRACT FROM AUTHOR]
- Published: 2024
10. Intrapulse Modulation Radar Signal Recognition Using CNN with Second-Order STFT-Based Synchrosqueezing Transform.
- Author: Dong, Ning; Jiang, Hong; Liu, Yipeng; Zhang, Jingtao
- Subjects: Convolutional neural networks; Signal classification; Fourier transforms; Signal-to-noise ratio; Radar; Photoplethysmography
- Abstract:
Intrapulse modulation classification of radar signals plays an important role in modern electronic reconnaissance, countermeasures, etc. In this paper, to improve the recognition rate at low signal-to-noise ratio (SNR), we propose a recognition method using the second-order short-time Fourier transform (STFT)-based synchrosqueezing transform (FSST2) combined with a modified convolutional neural network, which we name MeNet. In particular, the radar signals are first preprocessed via time–frequency analysis with the STFT-based FSST2. Then, the informative features of the resulting time–frequency images (TFIs) are deeply learned and classified by MeNet through several specific convolutional blocks. The simulation results show that the overall recognition rate for seven types of intrapulse modulation radar signals reaches 95.6%, even when the SNR is −12 dB. Compared with other networks, this excellent recognition rate demonstrates the superiority of our method. [ABSTRACT FROM AUTHOR]
- Published: 2024
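The first preprocessing step above, before the second-order synchrosqueezing refinement, is an ordinary STFT time-frequency image. A minimal NumPy sketch follows; the window length and hop are illustrative assumptions, and the FSST2 reassignment step itself is not implemented here:

```python
import numpy as np

def stft_magnitude(x, win_len=64, hop=16):
    """Magnitude STFT time-frequency image (freq bins x frames) of a
    1-D signal, using a Hann window and an rFFT per frame."""
    win = np.hanning(win_len)
    frames = [x[i:i + win_len] * win
              for i in range(0, len(x) - win_len + 1, hop)]
    return np.abs(np.fft.rfft(np.asarray(frames), axis=1)).T

# LFM-like test signal whose instantaneous frequency sweeps upward,
# a typical intrapulse modulation seen in radar pulses
t = np.arange(1024) / 1024.0
sig = np.cos(2 * np.pi * (50 * t + 100 * t ** 2))
tfi = stft_magnitude(sig)
```

In the resulting image the dominant frequency bin climbs from early to late frames, which is exactly the kind of pattern a CNN classifier learns to distinguish between modulation types.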
11. BAFormer: A Novel Boundary-Aware Compensation UNet-like Transformer for High-Resolution Cropland Extraction.
- Author: Li, Zhiyong; Wang, Youming; Tian, Fa; Zhang, Junbo; Chen, Yijie; Li, Kunhong
- Subjects: Convolutional neural networks; Transformer models; Deep learning; Remote sensing; Farms
- Abstract:
Utilizing deep learning for semantic segmentation of cropland from remote sensing imagery has become a crucial technique in land surveys. Cropland is highly heterogeneous and fragmented, and existing methods often suffer from inaccurate boundary segmentation. This paper introduces a UNet-like boundary-aware compensation model (BAFormer). Cropland boundaries typically exhibit rapid transitions in pixel values and texture features, often appearing as high-frequency features in remote sensing images. To enhance the recognition of these high-frequency features represented by cropland boundaries, the proposed BAFormer integrates a Feature Adaptive Mixer (FAM) and develops a Depthwise Large Kernel Multi-Layer Perceptron model (DWLK-MLP) to enrich the global and local cropland boundary features, respectively. Specifically, FAM enhances boundary awareness by adaptively acquiring high-frequency features through the complementary advantages of convolution and self-attention, while DWLK-MLP further supplements boundary position information using a large receptive field. The efficacy of BAFormer has been evaluated on the Vaihingen, Potsdam, LoveDA, and Mapcup datasets, achieving mIoU scores of 84.5%, 87.3%, 53.5%, and 83.1%, respectively. Notably, the lightweight BAFormer-T surpasses other lightweight models on the Vaihingen dataset with scores of 91.3% F1 and 84.1% mIoU. [ABSTRACT FROM AUTHOR]
- Published: 2024
12. Graph Neural Networks in Point Clouds: A Survey.
- Author: Li, Dilong; Lu, Chenghui; Chen, Ziyi; Guan, Jianlong; Zhao, Jing; Du, Jixiang
- Subjects: Graph neural networks; Convolutional neural networks; Natural language processing; Object recognition (Computer vision); Transformer models
- Abstract:
With the advancement of 3D sensing technologies, point clouds are gradually becoming the main type of data representation in applications such as autonomous driving, robotics, and augmented reality. Nevertheless, the irregularity inherent in point clouds presents numerous challenges for traditional deep learning frameworks. Graph neural networks (GNNs) have demonstrated their tremendous potential in processing graph-structured data and are widely applied in various domains including social media data analysis, molecular structure calculation, and computer vision. GNNs, with their capability to handle non-Euclidean data, offer a novel approach for addressing these challenges. Additionally, drawing inspiration from the achievements of transformers in natural language processing, graph transformers have propelled models towards global awareness, overcoming the limitations of local aggregation mechanisms inherent in early GNN architectures. This paper provides a comprehensive review of GNNs and graph-based methods in point cloud applications, adopting a task-oriented perspective to analyze this field. We categorize GNN methods for point clouds based on fundamental tasks, such as segmentation, classification, object detection, registration, and other related tasks. For each category, we summarize the existing mainstream methods, conduct a comprehensive analysis of their performance on various datasets, and discuss the development trends and future prospects of graph-based methods. [ABSTRACT FROM AUTHOR]
- Published: 2024
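The local aggregation mechanism of the early point-cloud GNNs surveyed above reduces to message passing over a k-nearest-neighbor graph. A one-round mean-aggregation sketch in NumPy, with illustrative k, points, and features:

```python
import numpy as np

def knn_graph_aggregate(points, feats, k=3):
    """One round of mean-neighbor message passing on a kNN graph,
    the basic local aggregation step of point-cloud GNNs."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)             # exclude self-loops
    nbrs = np.argsort(d, axis=1)[:, :k]     # indices of k nearest neighbors
    return feats[nbrs].mean(axis=1)         # aggregate neighbor features

# Two well-separated clusters with constant per-cluster features
pts = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
feats = np.array([[1.0], [1.0], [1.0], [9.0], [9.0], [9.0]])
agg = knn_graph_aggregate(pts, feats, k=2)
```

Because aggregation only mixes features within each local neighborhood, information stays cluster-local, which is precisely the limitation that graph transformers with global attention aim to overcome.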
13. A Study on the Object-Based High-Resolution Remote Sensing Image Classification of Crop Planting Structures in the Loess Plateau of Eastern Gansu Province.
- Author: Yang, Rui; Qi, Yuan; Zhang, Hui; Wang, Hongwei; Zhang, Jinlong; Ma, Xiaofang; Zhang, Juan; Ma, Chao
- Subjects: Image recognition (Computer vision); Convolutional neural networks; Remote sensing; Crops; Standard deviations; Image segmentation; Crop quality; Precision farming
- Abstract:
The timely and accurate acquisition of information on the distribution of the crop planting structure in the Loess Plateau of eastern Gansu Province, one of the most important agricultural areas in Western China, is crucial for promoting fine management of agriculture and ensuring food security. This study uses multi-temporal high-resolution remote sensing images to determine optimal segmentation scales for various crops, employing the estimation of scale parameter 2 (ESP2) tool and the Ratio of Mean Absolute Deviation to Standard Deviation (RMAS) model. The Canny edge detection algorithm is then applied for multi-scale image segmentation. By incorporating crop phenological factors and using the L1-regularized logistic regression model, we optimized 39 spatial feature factors, including spectral, textural, geometric, and index features. Within a multi-level classification framework, the Random Forest (RF) classifier and Convolutional Neural Network (CNN) model are used to classify the cropping patterns in four test areas based on the multi-scale segmented images. The results indicate that integrating the Canny edge detection algorithm with the optimal segmentation scales calculated using the ESP2 tool and RMAS model produces crop parcels with more complete boundaries and better separability. Additionally, optimizing spatial features using the L1-regularized logistic regression model, combined with phenological information, enhances classification accuracy. Within the OBIC framework, the RF classifier achieves higher accuracy in classifying cropping patterns. The overall classification accuracies for the four test areas are 91.93%, 94.92%, 89.37%, and 90.68%, respectively. This paper introduces crop phenological factors, effectively improving the extraction precision of the fragmented agricultural planting structure in the Loess Plateau of eastern Gansu Province. Its findings have important application value in crop monitoring, crop management, food security, and other related fields. [ABSTRACT FROM AUTHOR]
- Published: 2024
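Read literally, the RMAS criterion used above to score segmentation scales is the ratio of mean absolute deviation to standard deviation. The sketch below applies that literal definition to a set of per-segment mean values; the exact formulation in the paper (per band, per object, and how segments are sampled) may differ:

```python
import numpy as np

def rmas(segment_means):
    """Ratio of Mean Absolute Deviation to Standard Deviation, applied
    to per-segment mean values as a segmentation-scale score."""
    x = np.asarray(segment_means, dtype=float)
    mad = np.mean(np.abs(x - x.mean()))
    return mad / x.std()

# Hypothetical per-segment spectral means from two distinct crop parcels
score = rmas([10.0, 12.0, 11.0, 30.0, 31.0, 29.0])
```

Since the mean absolute deviation never exceeds the standard deviation, the ratio lies in (0, 1]; values near 1 indicate deviations of uniform size, while heavy-tailed segment statistics pull the ratio down.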
14. Application of Deep Learning for Segmenting Seepages in Levee Systems.
- Author: Panta, Manisha; Thapa, Padam Jung; Hoque, Md Tamjidul; Niles, Kendall N.; Sloan, Steve; Flanagin, Maik; Pathak, Ken; Abdelguerfi, Mahdi
- Subjects: Deep learning; Convolutional neural networks; Levees
- Abstract:
Seepage is a typical hydraulic factor that can initiate the breaching process in a levee system. If not identified and treated in time, seepages can be a severe problem for levees, weakening the levee structure and eventually leading to collapse. It is therefore essential to remain vigilant, with regular monitoring procedures to identify seepages throughout these levee systems and adequate repairs to limit potential threats from unforeseen levee failures. This paper introduces a fully convolutional neural network to identify and segment seepage from images of levee systems; to the best of our knowledge, this is the first work in this domain. Applying deep learning techniques to semantic segmentation tasks in real-world scenarios has its own challenges, especially the difficulty for models to effectively learn from complex backgrounds while focusing on simpler objects of interest. This challenge is particularly evident in detecting seepages in levee systems, where the fault is relatively simple compared to the complex and varied background. We addressed this problem by introducing negative images and a controlled transfer learning approach for accurate seepage segmentation in levee systems. [ABSTRACT FROM AUTHOR]
- Published: 2024
15. Deep Learning-Based Detection of Oil Spills in Pakistan's Exclusive Economic Zone from January 2017 to December 2023.
- Author: Basit, Abdul; Siddique, Muhammad Adnan; Bashir, Salman; Naseer, Ehtasham; Sarfraz, Muhammad Saquib
- Subjects: Convolutional neural networks; Oil spills; Oil seepage; Algal blooms; Toxic algae; Marine accidents; Inspection & review
- Abstract:
Oil spillages on a sea's or an ocean's surface are a threat to marine and coastal ecosystems. They are mainly caused by ship accidents, illegal discharge of oil from ships during cleaning and oil seepage from natural reservoirs. Synthetic-Aperture Radar (SAR) has proved to be a useful tool for analyzing oil spills, because it operates in all-day, all-weather conditions. An oil spill can typically be seen as a dark stretch in SAR images and can often be detected through visual inspection. The major challenge is to differentiate oil spills from look-alikes, i.e., low-wind areas, algae blooms and grease ice, etc., that have a dark signature similar to that of an oil spill. It has been noted over time that oil spill events in Pakistan's territorial waters often remain undetected until the oil reaches the coastal regions or it is located by concerned authorities during patrolling. A formal remote sensing-based operational framework for oil spills detection in Pakistan's Exclusive Economic Zone (EEZ) in the Arabian Sea is urgently needed. In this paper, we report the use of an encoder–decoder-based convolutional neural network trained on an annotated dataset comprising selected oil spill events verified by the European Maritime Safety Agency (EMSA). The dataset encompasses multiple classes, viz., sea surface, oil spill, look-alikes, ships and land. We processed Sentinel-1 acquisitions over the EEZ from January 2017 to December 2023, and we thereby prepared a repository of SAR images for the aforementioned duration. This repository contained images that had been vetted by SAR experts, to trace and confirm oil spills. We tested the repository using the trained model, and, to our surprise, we detected 92 previously unreported oil spill events within those seven years. In 2020, our model detected 26 oil spills in the EEZ, which corresponds to the highest number of spills detected in a single year; whereas in 2023, our model detected 10 oil spill events. 
In terms of the total surface area covered by the spills, the worst year was 2021, with a cumulative 395 sq. km covered in oil or an oil-like substance. On the whole, these are alarming figures. [ABSTRACT FROM AUTHOR]
- Published: 2024
16. Global-Local Collaborative Learning Network for Optical Remote Sensing Image Change Detection.
- Author: Li, Jinghui; Shao, Feng; Liu, Qiang; Meng, Xiangchao
- Subjects: Optical remote sensing; Collaborative learning; Convolutional neural networks; Transformer models; Remote sensing; Artificial satellites
- Abstract:
Due to the widespread applications of change detection technology in urban change analysis, environmental monitoring, agricultural surveillance, disaster detection, and other domains, the task of change detection has become one of the primary applications of Earth orbit satellite remote sensing data. However, the analysis of dual-temporal change detection (CD) remains a challenge in high-resolution optical remote sensing images due to the complexities in remote sensing images, such as intricate textures, seasonal variations in imaging time, climatic differences, and significant differences in the sizes of various objects. In this paper, we propose a novel U-shaped architecture for change detection. In the encoding stage, a multi-branch feature extraction module is employed by combining CNN and transformer networks to enhance the network's perception capability for objects of varying sizes. Furthermore, a multi-branch aggregation module is utilized to aggregate features from different branches, providing the network with global attention while preserving detailed information. For dual-temporal features, we introduce a spatiotemporal discrepancy perception module to model the context of dual-temporal images. Particularly noteworthy is the construction of channel attention and token attention modules based on the transformer attention mechanism to facilitate information interaction between multi-level features, thereby enhancing the network's contextual awareness. The effectiveness of the proposed network is validated on three public datasets, demonstrating its superior performance over other state-of-the-art methods through qualitative and quantitative experiments. [ABSTRACT FROM AUTHOR]
- Published: 2024
17. Lightweight Pedestrian Detection Network for UAV Remote Sensing Images Based on Strideless Pooling.
- Author: Liu, Sanzai; Cao, Lihua; Li, Yi
- Subjects: Object recognition (Computer vision); Pedestrians; Traffic monitoring; Convolutional neural networks; Emergency management
- Abstract:
The need for pedestrian target detection in uncrewed aerial vehicle (UAV) remote sensing images has become increasingly significant as the technology continues to evolve. UAVs equipped with high-resolution cameras can capture detailed imagery of various scenarios, making them ideal for monitoring and surveillance applications. Pedestrian detection is particularly crucial in scenarios such as traffic monitoring, security surveillance, and disaster response, where the safety and well-being of individuals are paramount. However, pedestrian detection in UAV remote sensing images poses several challenges. Firstly, the small size of pedestrians relative to the overall image, especially at higher altitudes, makes them difficult to detect. Secondly, the varying backgrounds and lighting conditions in remote sensing images can further complicate the task of detection. Traditional object detection methods often struggle to handle these complexities, resulting in decreased detection accuracy and increased false positives. Addressing the aforementioned concerns, this paper proposes a lightweight object detection model that integrates GhostNet and YOLOv5s. Building upon this foundation, we further introduce the SPD-Conv module to the model. With this addition, the aim is to preserve fine-grained features of the images during downsampling, thereby enhancing the model's capability to recognize small-scale objects. Furthermore, the coordinate attention module is introduced to further improve the model's recognition accuracy. In the proposed model, the number of parameters is successfully reduced to 4.77 M, compared with 7.01 M in YOLOv5s, representing a 32% reduction. The mean average precision (mAP) increased from 0.894 to 0.913, reflecting a 1.9% improvement. We have named the proposed model "GSC-YOLO". 
This study holds significant importance in advancing the lightweighting of UAV target detection models and addressing the challenges associated with complex scene object detection. [ABSTRACT FROM AUTHOR]
- Published: 2024
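The SPD-Conv module adopted above replaces strided convolution with a space-to-depth rearrangement, so downsampling discards no pixels and fine-grained detail survives for small-object detection. The rearrangement itself (shown here channels-last, without the follow-up non-strided convolution) can be sketched as:

```python
import numpy as np

def space_to_depth(x, scale=2):
    """SPD downsampling: rearrange an (H, W, C) map into
    (H/scale, W/scale, C*scale^2), trading resolution for channels
    so that no pixel is discarded (unlike strided conv or pooling)."""
    H, W, C = x.shape
    x = x.reshape(H // scale, scale, W // scale, scale, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(
        H // scale, W // scale, C * scale * scale)

# A 4x4 single-channel map becomes a 2x2 map with 4 channels
x = np.arange(16.0).reshape(4, 4, 1)
y = space_to_depth(x)
```

Every input value reappears in the output; each output position stacks its 2x2 input block into the channel dimension.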
18. MFPANet: Multi-Scale Feature Perception and Aggregation Network for High-Resolution Snow Depth Estimation.
- Author: Zhao, Liling; Chen, Junyu; Shahzad, Muhammad; Xia, Min; Lin, Haifeng
- Subjects: Snow accumulation; Microwave remote sensing; Synthetic aperture radar; Remote-sensing images; Depth perception; Remote sensing; Avalanches
- Abstract:
Accurate snow depth estimation is of significant importance, particularly for preventing avalanche disasters and predicting flood seasons. The predominant deep learning approaches to snow depth estimation typically rely on passive microwave remote sensing data, whose low resolution often leads to low-accuracy outcomes, posing considerable limitations in application. To further improve the accuracy of snow depth estimation, in this paper we used active microwave remote sensing data. We fused multi-spectral optical satellite images, synthetic aperture radar (SAR) images, and land cover distribution images to generate a snow remote sensing dataset (SRSD), a first-of-its-kind dataset that includes active microwave remote sensing images of high-latitude regions of Asia. Using these novel data, we proposed a multi-scale feature perception and aggregation neural network (MFPANet) that focuses on improving feature extraction from multi-source images. Our systematic analysis reveals that the proposed approach is not only robust but also achieves high accuracy in snow depth estimation compared to existing state-of-the-art methods, with an RMSE of 0.360 and an MAE of 0.128. Finally, we selected several representative areas in our study region and applied our method to map snow depth distribution, demonstrating its broad application prospects. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
19. Hyperspectral Image Denoising Based on Deep and Total Variation Priors.
- Author
-
Wang, Peng, Sun, Tianman, Chen, Yiming, Ge, Lihua, Wang, Xiaoyi, and Wang, Liguo
- Subjects
- *
DEEP learning , *IMAGE denoising , *CONVOLUTIONAL neural networks , *SPECTRAL imaging , *SPARSE matrices , *STOCHASTIC processes - Abstract
To address the problems of noise interference and image blurring in hyperspectral imaging (HSI), this paper proposes a denoising method for HSI based on deep learning and a total variation (TV) prior. The method minimizes the first-order moment distance between the deep prior of a Fast and Flexible Denoising Convolutional Neural Network (FFDNet) and the Enhanced 3D TV (E3DTV) prior, obtaining dual priors that complement and reinforce each other's advantages. Specifically, the original HSI is initially processed with a random binary sparse observation matrix to achieve a sparse representation. Subsequently, the plug-and-play (PnP) algorithm is employed within the framework of generalized alternating projection (GAP) to denoise the sparsely represented HSI. Experimental results demonstrate that, compared to existing methods, this method shows significant advantages in both quantitative and qualitative assessments, effectively enhancing the quality of HSIs. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
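The denoising pipeline in entry 19 plugs a learned denoiser into the generalized alternating projection (GAP) framework. A minimal sketch of such a PnP-GAP loop, with a soft-threshold stand-in for the paper's FFDNet denoiser and a random binary sparse observation matrix as described in the abstract (all sizes and parameters here are illustrative assumptions, not the paper's settings):

```python
import numpy as np

def pnp_gap(y, Phi, denoise, iters=50):
    """Plug-and-play within generalized alternating projection (sketch).

    y       : observations, y = Phi @ x
    Phi     : random binary sparse observation matrix
    denoise : any off-the-shelf denoiser plugged in for the prior step
    """
    gram_inv = np.linalg.inv(Phi @ Phi.T)   # for the data-consistency projection
    x = Phi.T @ y                           # crude initial estimate
    for _ in range(iters):
        # Euclidean projection toward the data-consistency set {x : Phi x = y}
        v = x + Phi.T @ (gram_inv @ (y - Phi @ x))
        # Prior step: plug in the denoiser (FFDNet in the paper)
        x = denoise(v)
    return x

# Stand-in denoiser: soft-thresholding (a simple sparsity prior), NOT FFDNet.
def soft_threshold(v, lam=0.05):
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

rng = np.random.default_rng(0)
n, m = 64, 32
x_true = np.zeros(n)
x_true[rng.choice(n, 5, replace=False)] = 1.0
Phi = (rng.random((m, n)) < 0.3).astype(float)  # random binary sparse matrix
y = Phi @ x_true
x_hat = pnp_gap(y, Phi, soft_threshold)
```

Any denoiser with the same call signature can be swapped in for `soft_threshold`, which is the appeal of the plug-and-play formulation.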
20. Transfer Learning-Based Specific Emitter Identification for ADS-B over Satellite System.
- Author
-
Liu, Mingqian, Chai, Yae, Li, Ming, Wang, Jiakun, and Zhao, Nan
- Subjects
- *
CONVOLUTIONAL neural networks , *LOW earth orbit satellites , *AUTOMATIC dependent surveillance-broadcast , *HUMAN fingerprints , *FEATURE extraction , *DISTRIBUTED sensors - Abstract
In future aviation surveillance, the demand for higher real-time updates for global flights can be met by deploying automatic dependent surveillance–broadcast (ADS-B) receivers on low Earth orbit satellites, capitalizing on their global coverage and terrain-independent capabilities for seamless monitoring. Specific emitter identification (SEI) leverages the distinctive features of ADS-B data. However, high data collection and annotation costs, along with limited dataset size, can lead to overfitting during training and low model recognition accuracy. Transfer learning, which does not require source and target domain data to share the same distribution, significantly reduces the sensitivity of traditional models to data volume and distribution. It can also address issues related to the incompleteness and inadequacy of communication emitter datasets. This paper proposes a distributed sensor system based on transfer learning to address specific emitter identification. Firstly, signal fingerprint features are extracted using a bispectrum transform (BST) to preliminarily train a convolutional neural network (CNN). Decision fusion is employed to tackle the challenges of the distributed system. Subsequently, a transfer learning strategy is employed, incorporating frozen model parameters, maximum mean discrepancy (MMD), and classification error measures to reduce the disparity between the target and source domains. A hyperbolic space module is introduced before the output layer to enhance the expressive capacity and data information extraction. After iterative training, the transfer learning model is obtained. Simulation results confirm that this method enhances model generalization, addresses the issue of slow convergence, and leads to improved training accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
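The transfer strategy in entry 20 minimizes the maximum mean discrepancy (MMD) between target and source domains. A small illustration of how a kernel MMD quantifies such a domain gap, using an RBF kernel and toy Gaussian features (the feature extractor, kernel bandwidth, and data here are assumptions for demonstration, not the paper's setup):

```python
import numpy as np

def mmd_rbf(X, Y, sigma=4.0):
    """Squared maximum mean discrepancy with an RBF kernel (biased estimate).
    sigma is the kernel bandwidth; in practice it is often set by the
    median heuristic on pairwise distances."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

rng = np.random.default_rng(0)
src = rng.normal(0.0, 1.0, (200, 8))       # "source domain" features
tgt_near = rng.normal(0.0, 1.0, (200, 8))  # same distribution as the source
tgt_far = rng.normal(2.0, 1.0, (200, 8))   # shifted "target domain"

gap_near = mmd_rbf(src, tgt_near)
gap_far = mmd_rbf(src, tgt_far)            # much larger domain gap
```

Adding such an MMD term to the classification loss, as the abstract describes, penalizes feature distributions that drift apart between domains.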
21. Improving Artificial-Intelligence-Based Individual Tree Species Classification Using Pseudo Tree Crown Derived from Unmanned Aerial Vehicle Imagery.
- Author
-
Miao, Shengjie, Zhang, Kongwen, Zeng, Hongda, and Liu, Jane
- Subjects
- *
CROWNS (Botany) , *DRONE aircraft , *CONVOLUTIONAL neural networks , *LANDSAT satellites , *URBAN trees , *ARTIFICIAL intelligence - Abstract
Urban tree classification enables informed decision-making processes in urban planning and management. This paper introduces a novel data reformation method, pseudo tree crown (PTC), which enhances the feature difference in the input layer and results in the improvement of the accuracy and efficiency of urban tree classification by utilizing artificial intelligence (AI) techniques. The study involved a comparative analysis of the performance of various machine learning (ML) classifiers. The results revealed a significant enhancement in classification accuracy, with an improvement exceeding 10% observed when high spatial resolution imagery captured by an unmanned aerial vehicle (UAV) was utilized. Furthermore, the study found an impressive average classification accuracy of 93% achieved by a classifier built on the PyTorch framework, with ResNet50 leveraged as its convolutional neural network layer. These findings underscore the potential of AI-driven approaches in advancing urban tree classification methodologies for enhanced urban planning and management practices. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
22. Combining "Deep Learning" and Physically Constrained Neural Networks to Derive Complex Glaciological Change Processes from Modern High-Resolution Satellite Imagery: Application of the GEOCLASS-Image System to Create VarioCNN for Glacier Surges.
- Author
-
Herzfeld, Ute C., Hessburg, Lawrence J., Trantow, Thomas M., and Hayes, Adam N.
- Subjects
- *
REMOTE-sensing images , *CONVOLUTIONAL neural networks , *DEEP learning , *GLACIERS , *IMAGE recognition (Computer vision) , *ACCELERATION (Mechanics) - Abstract
The objectives of this paper are to investigate the trade-offs between a physically constrained neural network and a deep, convolutional neural network and to design a combined ML approach ("VarioCNN"). Our solution is provided in the framework of a cyberinfrastructure that includes a newly designed ML software, GEOCLASS-image (v1.0), modern high-resolution satellite image data sets (Maxar WorldView data), and instructions/descriptions that may facilitate solving similar spatial classification problems. Combining the advantages of the physically-driven connectionist-geostatistical classification method with those of an efficient CNN, VarioCNN provides a means for rapid and efficient extraction of complex geophysical information from submeter resolution satellite imagery. A retraining loop overcomes the difficulties of creating a labeled training data set. Computational analyses and developments are centered on a specific, but generalizable, geophysical problem: The classification of crevasse types that form during the surge of a glacier system. A surge is a glacial catastrophe, an acceleration of a glacier to typically 100–200 times its normal velocity. GEOCLASS-image is applied to study the current (2016–2024) surge in the Negribreen Glacier System, Svalbard. The geophysical result is a description of the structural evolution and expansion of the surge, based on crevasse types that capture ice deformation in six simplified classes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
23. Vulnerable Road User Skeletal Pose Estimation Using mmWave Radars.
- Author
-
Zeng, Zhiyuan, Liang, Xingdong, Li, Yanlei, and Dang, Xiangwei
- Subjects
- *
ROAD users , *TRACKING radar , *RADAR targets , *CONVOLUTIONAL neural networks , *RADAR signal processing , *DATA augmentation - Abstract
A skeletal pose estimation method, named RVRU-Pose, is proposed to estimate the skeletal pose of vulnerable road users based on distributed non-coherent mmWave radar. In view of the limitation that existing methods for skeletal pose estimation are only applicable to small scenes, this paper proposes a strategy that combines radar intensity heatmaps and coordinate heatmaps as input to a deep learning network. In addition, we design a multi-resolution data augmentation and training method suitable for radar to achieve target pose estimation for remote and multi-target application scenarios. Experimental results show that RVRU-Pose can achieve better than 2 cm average localization accuracy for different subjects in different scenarios, which is superior in terms of accuracy and time compared to existing state-of-the-art methods for human skeletal pose estimation with radar. As an essential performance parameter of radar, the impact of angular resolution on the estimation accuracy of a skeletal pose is quantitatively analyzed and evaluated in this paper. Finally, RVRU-Pose has also been extended to the task of estimating the skeletal pose of a cyclist, reflecting the strong scalability of the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
24. LDnADMM-Net: A Denoising Unfolded Deep Neural Network for Direction-of-Arrival Estimations in A Low Signal-to-Noise Ratio.
- Author
-
Liang, Can, Liu, Mingxuan, Li, Yang, Wang, Yanhua, and Hu, Xueyao
- Subjects
- *
DIRECTION of arrival estimation , *CONVOLUTIONAL neural networks , *SIGNAL-to-noise ratio , *COMPRESSED sensing , *SIGNAL denoising - Abstract
In this paper, we explore the problem of direction-of-arrival (DOA) estimation for a non-uniform linear array (NULA) under strong noise. Compressed sensing (CS)-based methods are widely used in NULA DOA estimation. However, these methods commonly rely on parameters that are hard to fine-tune, and they lack robustness under strong noise. To address these issues, this paper proposes a novel DOA estimation approach using a deep neural network (DNN) for a NULA at a low signal-to-noise ratio (SNR). The proposed network, dubbed LDnADMM-Net, is designed based on the denoising convolutional neural network (DnCNN) and the alternating direction method of multipliers (ADMM). First, we construct an unfolded DNN architecture that mimics the iterative processing of an ADMM. In this way, the parameters of the ADMM are transformed into network weights, and thus we can adaptively optimize these parameters through network training. Then, we employ the DnCNN to develop a denoising module (DnM) and integrate it into the unfolded DNN. Using this DnM, we can enhance the anti-noise ability of the proposed network and obtain a robust DOA estimation at a low SNR. The simulation and experimental results show that the proposed LDnADMM-Net can obtain high-accuracy, super-resolution DOA estimates for a NULA with strong robustness at a low SNR. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
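Entry 24 unfolds ADMM iterations into network layers with learnable parameters. As a point of reference, the classic (non-learned) ADMM loop being unfolded looks roughly like this for a generic sparse recovery problem; in an unfolded network such as LDnADMM-Net, each pass of the loop becomes a layer, the penalty and threshold parameters become trainable weights, and a CNN denoiser replaces the soft-threshold step (problem sizes and parameters below are illustrative, not taken from the paper):

```python
import numpy as np

def soft(v, t):
    """Soft-thresholding: the proximal operator of the l1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def admm_lasso(A, y, lam=0.01, rho=1.0, iters=200):
    """ADMM for min_x 0.5*||Ax - y||^2 + lam*||x||_1 (split as x = z)."""
    n = A.shape[1]
    x = z = u = np.zeros(n)
    inv = np.linalg.inv(A.T @ A + rho * np.eye(n))  # reused every iteration
    Aty = A.T @ y
    for _ in range(iters):
        x = inv @ (Aty + rho * (z - u))   # quadratic (data) subproblem
        z = soft(x + u, lam / rho)        # prox / "denoising" subproblem
        u = u + x - z                     # dual ascent on the constraint x = z
    return z

rng = np.random.default_rng(1)
m, n = 40, 80
x_true = np.zeros(n)
x_true[rng.choice(n, 4, replace=False)] = 1.0
A = rng.normal(size=(m, n)) / np.sqrt(m)
y = A @ x_true
x_hat = admm_lasso(A, y)
```

The abstract's point is that `lam` and `rho`, which are hard to hand-tune, fall out of training once the loop is unrolled into layers.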
25. Target Detection Method for High-Frequency Surface Wave Radar RD Spectrum Based on (VI)CFAR-CNN and Dual-Detection Maps Fusion Compensation.
- Author
-
Ji, Yuanzheng, Liu, Aijun, Chen, Xuekun, Wang, Jiaqi, and Yu, Changjun
- Subjects
- *
CONVOLUTIONAL neural networks , *TRACKING algorithms , *AUTOMATIC identification - Abstract
This paper proposes a method for the intelligent detection of high-frequency surface wave radar (HFSWR) targets. This method cascades the adaptive variability-index constant false alarm rate detector, (VI)CFAR, with a convolutional neural network (CNN) to form the cascade detector (VI)CFAR-CNN. First, the (VI)CFAR algorithm is used for the first-level detection of the range–Doppler (RD) spectrum; based on this result, two-dimensional window slices centered on each detected target's position on the RD spectrum are extracted and input into the CNN model for further target and clutter identification. When the detection rate of the detector reaches a certain level and cannot be further improved due to the convergence of the CNN model, this paper uses a dual-detection maps fusion method to compensate for the loss of detection performance. In this step, the optimized parameters are used to perform the weighted fusion of the dual-detection maps, and then the connected components in the fused detection map are further processed so that an independent (VI)CFAR compensates for the (VI)CFAR-CNN detection results. Due to the difficulty in obtaining HFSWR data that include comprehensive and accurate target truth values, this paper adopts a method of embedding targets into the measured background to construct the RD spectrum dataset for HFSWR. At the same time, the proposed method is compared with various other methods to demonstrate its superiority. Additionally, a small amount of automatic identification system (AIS) and radar correlation data are used to verify the effectiveness and feasibility of this method on completely measured HFSWR data. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
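The cascade in entry 25 starts with a CFAR detector on the range–Doppler map before the CNN stage. A minimal one-dimensional cell-averaging CFAR sketches the idea; the paper uses the variability-index (VI) variant on 2D RD spectra, so everything below (window sizes, threshold scale, the simulated clutter) is an illustrative simplification:

```python
import numpy as np

def ca_cfar(power, guard=2, train=8, scale=4.0):
    """Cell-averaging CFAR on a 1D power profile.

    For each cell, the noise level is estimated from `train` cells on each
    side, skipping `guard` cells around the cell under test; a detection is
    declared when the cell exceeds `scale` times that estimate.
    """
    n = len(power)
    det = np.zeros(n, dtype=bool)
    for i in range(n):
        lo = max(0, i - guard - train)
        hi = min(n, i + guard + train + 1)
        cells = np.r_[power[lo:max(0, i - guard)],
                      power[min(n, i + guard + 1):hi]]
        if cells.size and power[i] > scale * cells.mean():
            det[i] = True
    return det

rng = np.random.default_rng(0)
noise = rng.exponential(1.0, 200)  # exponential clutter/noise power
noise[60] += 40.0                  # a target embedded in the measured background
hits = ca_cfar(noise)
```

In the paper's pipeline, windows around each `hits` location would then be cut out of the RD spectrum and passed to the CNN for target/clutter discrimination.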
26. SSAformer: Spatial–Spectral Aggregation Transformer for Hyperspectral Image Super-Resolution.
- Author
-
Wang, Haoqian, Zhang, Qi, Peng, Tao, Xu, Zhongjie, Cheng, Xiangai, Xing, Zhongyang, and Li, Teng
- Subjects
- *
TRANSFORMER models , *HIGH resolution imaging , *CONVOLUTIONAL neural networks , *REMOTE sensing , *ENVIRONMENTAL monitoring , *SPECTRAL imaging , *IMAGE reconstruction algorithms - Abstract
The hyperspectral image (HSI) distinguishes itself in material identification through its exceptional spectral resolution. However, its spatial resolution is constrained by hardware limitations, prompting the evolution of HSI super-resolution (SR) techniques. Single HSI SR endeavors to reconstruct high-spatial-resolution HSI from low-spatial-resolution inputs, and recent progress in deep learning-based algorithms has significantly advanced the quality of reconstructed images. However, convolutional methods struggle to extract comprehensive spatial and spectral features. Transformer-based models have yet to harness long-range dependencies across both dimensions fully, thus inadequately integrating spatial and spectral data. To solve the above problem, in this paper, we propose a new HSI SR method, SSAformer, which merges the strengths of CNNs and Transformers. It introduces specially designed attention mechanisms for HSI, including spatial and spectral attention modules, and overcomes the previous challenges in extracting and amalgamating spatial and spectral information. Evaluations on benchmark datasets show that SSAformer surpasses contemporary methods in enhancing spatial details and preserving spectral accuracy, underscoring its potential to expand HSI's utility in various domains, such as environmental monitoring and remote sensing. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
27. Changes in the Water Area of an Inland River Terminal Lake (Taitma Lake) Driven by Climate Change and Human Activities, 2017–2022.
- Author
-
Zi, Feng, Wang, Yong, Lu, Shanlong, Ikhumhen, Harrison Odion, Fang, Chun, Li, Xinru, Wang, Nan, and Kuang, Xinya
- Subjects
- *
ENDORHEIC lakes , *WATER resources development , *CONVOLUTIONAL neural networks , *LAKES , *DEEP learning , *CLIMATE change - Abstract
Using a dataset capturing the seasonal and annual water body distribution of the lower Qarqan River in the Taitma Lake area from 2017 to 2022, combined with meteorological and hydraulic engineering data, we determined the spatial and temporal change patterns of the Taitma Lake watershed area. Analyses were conducted using Planetscope (PS) satellite images and a deep learning model. The results revealed the following: ① Deep learning-based water body extraction provides significantly greater accuracy than the conventional water body index approach. With an impressive accuracy of up to 96.0%, UPerNet was found to provide the most effective extraction results among the three convolutional neural networks (U-Net, DeeplabV3+, and UPerNet) used for semantic segmentation; ② Between 2017 and 2022, Taitma Lake's water area experienced a rapid decrease, with the distribution of water predominantly shifting towards the east–west direction more than the north–south. The shifts between 2017 and 2020 and between 2020 and 2022 were clearly discernible, with the latter stage (2020–2022) being more significant than the former (2017–2020); ③ According to observations, Taitma Lake's changing water area has been primarily influenced by human activity over the last six years. This study provides a valuable scientific basis for water resource allocation aiming to balance the development of water resources in the middle and upper reaches of the Tarim and Qarqan Rivers, as well as for the ecological protection of the downstream Taitma Lake. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
28. MEA-EFFormer: Multiscale Efficient Attention with Enhanced Feature Transformer for Hyperspectral Image Classification.
- Author
-
Sun, Qian, Zhao, Guangrui, Fang, Yu, Fang, Chenrong, Sun, Le, and Li, Xingying
- Subjects
- *
IMAGE recognition (Computer vision) , *CONVOLUTIONAL neural networks , *DEEP learning , *TRANSFORMER models , *FEATURE extraction - Abstract
Hyperspectral image classification (HSIC) has garnered increasing attention among researchers. While classical networks like convolutional neural networks (CNNs) have achieved satisfactory results with the advent of deep learning, they are confined to processing local information. Vision transformers, despite being effective at establishing long-distance dependencies, face challenges in extracting high-representation features for high-dimensional images. In this paper, we present the multiscale efficient attention with enhanced feature transformer (MEA-EFFormer), which is designed for the efficient extraction of spectral–spatial features, leading to effective classification. MEA-EFFormer employs a multiscale efficient attention feature extraction module to initially extract 3D convolution features and applies effective channel attention to refine spectral information. Following this, 2D convolution features are extracted and integrated with local binary pattern (LBP) spatial information to augment their representation. Then, the processed features are fed into a spectral–spatial enhancement attention (SSEA) module that facilitates interactive enhancement of spectral–spatial information across the three dimensions. Finally, these features undergo classification through a transformer encoder. We evaluate MEA-EFFormer against several state-of-the-art methods on three datasets and demonstrate its outstanding HSIC performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
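Entry 28 augments its 2D convolution features with local binary pattern (LBP) spatial information. The basic 8-neighbour LBP code can be computed as follows (a textbook formulation for illustration; the paper's exact LBP variant is not specified in the abstract):

```python
import numpy as np

def lbp(img):
    """Basic 8-neighbour local binary pattern.

    Each interior pixel gets an 8-bit code: one bit per neighbour,
    set when the neighbour is >= the centre pixel.
    """
    h, w = img.shape
    center = img[1:h - 1, 1:w - 1]
    out = np.zeros((h - 2, w - 2), dtype=int)
    # Neighbour offsets, clockwise from the top-left corner.
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (di, dj) in enumerate(offs):
        nb = img[1 + di:h - 1 + di, 1 + dj:w - 1 + dj]
        out += (nb >= center).astype(int) << bit
    return out

# On a flat patch every neighbour ties with the centre, so all bits are set.
codes = lbp(np.full((4, 4), 5.0))
```

Histograms of such codes over image patches give the texture descriptor that the network fuses with its learned convolution features.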
29. Locating and Grading of Lidar-Observed Aircraft Wake Vortex Based on Convolutional Neural Networks.
- Author
-
Zhang, Xinyu, Zhang, Hongwei, Wang, Qichao, Liu, Xiaoying, Liu, Shouxin, Zhang, Rongchuan, Li, Rongzhong, and Wu, Songhua
- Subjects
- *
CONVOLUTIONAL neural networks , *DOPPLER lidar , *AERONAUTICAL safety measures - Abstract
Aircraft wake vortices are serious threats to aviation safety. The Pulsed Coherent Doppler Lidar (PCDL) has been widely used in the observation of aircraft wake vortices due to its advantages of high spatial-temporal resolution and high precision. However, the post-processing algorithms require significant computing resources, which cannot achieve the real-time detection of a wake vortex (WV). This paper presents an improved Convolutional Neural Network (CNN) method for WV locating and grading based on PCDL data to avoid the influence of unstable ambient wind fields on the localization and classification results of the WV. Typical WV cases are selected for analysis, and the WV locating and grading models are validated on different test sets. The consistency of the analytical algorithm and the CNN algorithm is verified. The results indicate that the improved CNN method achieves satisfactory recognition accuracy with higher efficiency and better robustness, especially in the case of strong turbulence, where the CNN method recognizes the wake vortex while the analytical method cannot. The improved CNN method is expected to be applied to optimize the current aircraft spacing criteria, which is promising for improving aviation safety and economic benefit. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
30. Object-Based Semi-Supervised Spatial Attention Residual UNet for Urban High-Resolution Remote Sensing Image Classification.
- Author
-
Lu, Yuanbing, Li, Huapeng, Zhang, Ce, and Zhang, Shuqing
- Subjects
- *
CONVOLUTIONAL neural networks , *DISTRIBUTION (Probability theory) , *WILCOXON signed-rank test , *DEEP learning , *LAND cover - Abstract
Accurate urban land cover information is crucial for effective urban planning and management. While convolutional neural networks (CNNs) demonstrate superior feature learning and prediction capabilities using image-level annotations, the inherent mixed-category nature of input image patches leads to classification errors along object boundaries. Fully convolutional neural networks (FCNs) excel at pixel-wise fine segmentation, making them less susceptible to heterogeneous content, but they require fully annotated dense image patches, which may not be readily available in real-world scenarios. This paper proposes an object-based semi-supervised spatial attention residual UNet (OS-ARU) model. First, multiscale segmentation is performed to obtain segments from a remote sensing image, and segments containing sample points are assigned the categories of the corresponding points, which are used to train the model. Then, the trained model predicts class probabilities for all segments. Each unlabeled segment's probability distribution is compared against those of labeled segments for similarity matching under a threshold constraint. Through label propagation, pseudo-labels are assigned to unlabeled segments exhibiting high similarity to labeled ones. Finally, the model is retrained using the augmented training set incorporating the pseudo-labeled segments. Comprehensive experiments on aerial image benchmarks for Vaihingen and Potsdam demonstrate that the proposed OS-ARU achieves higher classification accuracy than state-of-the-art models, including OCNN, 2OCNN, and standard OS-U, reaching an overall accuracy (OA) of 87.83% and 86.71%, respectively. The performance improvements over the baseline methods are statistically significant according to the Wilcoxon Signed-Rank Test. Despite using significantly fewer sparse annotations, this semi-supervised approach still achieves comparable accuracy to the same model under full supervision. 
The proposed method thus makes a step forward in substantially alleviating the heavy sampling burden of FCNs (densely sampled deep learning models) to effectively handle the complex issue of land cover information identification and classification. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
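The label-propagation step in entry 30 matches each unlabeled segment's predicted class-probability distribution against those of labeled segments under a threshold constraint. A toy sketch of that idea, using cosine similarity as the matching measure (the paper's actual similarity measure, threshold value, and data are assumptions here):

```python
import numpy as np

def propagate_pseudo_labels(probs_unlab, probs_lab, labels_lab, thresh=0.9):
    """Assign a pseudo-label to each unlabeled segment whose class
    distribution is sufficiently similar to some labeled segment's;
    segments below the threshold stay unlabeled (-1)."""
    pseudo = np.full(len(probs_unlab), -1)
    for i, p in enumerate(probs_unlab):
        # Cosine similarity between p and every labeled distribution.
        sims = (probs_lab @ p) / (
            np.linalg.norm(probs_lab, axis=1) * np.linalg.norm(p) + 1e-12)
        j = int(np.argmax(sims))
        if sims[j] >= thresh:
            pseudo[i] = labels_lab[j]
    return pseudo

probs_lab = np.array([[0.9, 0.05, 0.05],    # confidently class 0
                      [0.1, 0.8, 0.1]])     # confidently class 1
labels_lab = np.array([0, 1])
probs_unlab = np.array([[0.85, 0.1, 0.05],   # close to the class-0 profile
                        [0.34, 0.33, 0.33]]) # ambiguous: stays unlabeled
pseudo = propagate_pseudo_labels(probs_unlab, probs_lab, labels_lab)
```

Segments that receive a pseudo-label would then be folded back into the training set for the retraining round the abstract describes.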
31. Remote Sensing Image Dehazing via a Local Context-Enriched Transformer.
- Author
-
Nie, Jing, Xie, Jin, and Sun, Hanqing
- Subjects
- *
TRANSFORMER models , *REMOTE sensing , *CONVOLUTIONAL neural networks , *IMAGE reconstruction , *IMAGE processing - Abstract
Remote sensing image dehazing is a well-known remote sensing image processing task focused on restoring clean images from hazy images. The Transformer network, based on the self-attention mechanism, has demonstrated remarkable advantages in various image restoration tasks, due to its capacity to capture long-range dependencies within images. However, it is weak at modeling local context. Conversely, convolutional neural networks (CNNs) are adept at capturing local contextual information. Local contextual information could provide more details, while long-range dependencies capture global structure information. The combination of long-range dependencies and local context modeling is beneficial for remote sensing image dehazing. Therefore, in this paper, we propose a CNN-based adaptive local context enrichment module (ALCEM) to extract contextual information within local regions. Subsequently, we integrate our proposed ALCEM into the multi-head self-attention and feed-forward network of the Transformer, constructing a novel locally enhanced attention (LEA) and a local continuous-enhancement feed-forward network (LCFN). The LEA utilizes the ALCEM to inject local context information that is complementary to the long-range relationship modeled by multi-head self-attention, which is beneficial to removing haze and restoring details. The LCFN extracts multi-scale spatial information and selectively fuses it via the ALCEM, supplying more informative features than regular feed-forward networks with only position-specific information flow. Powered by the LEA and LCFN, a novel Transformer-based dehazing network termed LCEFormer is proposed to restore clear images from hazy remote sensing images, which combines the advantages of CNN and Transformer. Experiments conducted on three distinct datasets, namely DHID, ERICE, and RSID, demonstrate that our proposed LCEFormer achieves state-of-the-art performance in hazy scenes.
Specifically, our LCEFormer outperforms DCIL by 0.78 dB in PSNR and 0.018 in SSIM on the DHID dataset. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
32. Detection of Military Targets on Ground and Sea by UAVs with Low-Altitude Oblique Perspective.
- Author
-
Zeng, Bohan, Gao, Shan, Xu, Yuelei, Zhang, Zhaoxiang, Li, Fan, and Wang, Chenghang
- Subjects
- *
CONVOLUTIONAL neural networks , *TRANSFORMER models - Abstract
Small-scale low-altitude unmanned aerial vehicles (UAVs) equipped with perception capability for military targets will become increasingly essential for strategic reconnaissance and stationary patrols in the future. To respond to challenges such as complex terrain and weather variations, as well as the deception and camouflage of military targets, this paper proposes a hybrid detection model that combines Convolutional Neural Network (CNN) and Transformer architecture in a decoupled manner. The proposed detector consists of the C-branch and the T-branch. In the C-branch, Multi-gradient Path Network (MgpNet) is introduced, inspired by the multi-gradient flow strategy, excelling in capturing the local feature information of an image. In the T-branch, RPFormer, a Region–Pixel two-stage attention mechanism, is proposed to aggregate the global feature information of the whole image. A feature fusion strategy is proposed to merge the feature layers of the two branches, further improving the detection accuracy. Furthermore, to better simulate real UAVs' reconnaissance environments, we construct a dataset of military targets in complex environments captured from an oblique perspective to evaluate the proposed detector. In ablation experiments, different fusion methods are validated, and the results demonstrate the effectiveness of the proposed fusion strategy. In comparative experiments, the proposed detector outperforms most advanced general detectors. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
33. Prediction of Sea Surface Temperature Using U-Net Based Model.
- Author
-
Ren, Jing, Wang, Changying, Sun, Ling, Huang, Baoxiang, Zhang, Deyu, Mu, Jiadong, and Wu, Jianqiang
- Subjects
- *
OCEAN temperature , *CONVOLUTIONAL neural networks - Abstract
Sea surface temperature (SST) is a key parameter in ocean hydrology. Currently, existing SST prediction methods fail to fully utilize the potential spatial correlation between variables. To address this challenge, we propose a spatiotemporal UNet (ST-UNet) model based on the UNet model. In particular, in the encoding phase of ST-UNet, we use parallel convolution with different kernel sizes to efficiently extract spatial features, and use ConvLSTM to capture temporal features based on the utilization of spatial features. An Atrous Spatial Pyramid Pooling (ASPP) module is placed at the bottleneck of the network to further incorporate the multi-scale features, allowing the spatial features to be fully utilized. The final prediction is then generated in the decoding stage using parallel convolution with different kernel sizes, similar to the encoding stage. We conducted a series of experiments on the Bohai Sea and Yellow Sea SST data set, as well as the South China Sea SST data set, using SST data from the past 35 days to predict SST data for 1, 3, and 7 days in the future. The model was trained using data spanning from 2010 to 2021, with data from 2022 being utilized to assess the model's predictive performance. The experimental results show that the proposed model achieves excellent results at different prediction scales in both sea areas, and the model consistently outperforms other methods. Specifically, in the Bohai Sea and Yellow Sea areas, when the prediction scales are 1, 3, and 7 days, the MAE of ST-UNet improves on the best results of the other three compared models by 17%, 12%, and 2%, and the MSE by 16%, 18%, and 9%, respectively. In the South China Sea, when the prediction ranges are 1, 3, and 7 days, the MAE of ST-UNet improves on the best of the other three compared models by 27%, 18%, and 3%, and the MSE by 46%, 39%, and 16%, respectively.
Our results highlight the effectiveness of the ST-UNet model in capturing spatial correlations and accurately predicting SST. The proposed model is expected to improve marine hydrographic studies. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
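ST-UNet's encoder extracts spatial features with parallel convolutions of different kernel sizes. A toy NumPy sketch of that multi-kernel idea on a single-channel field (mean filters stand in for learned kernels, and the real model additionally uses ConvLSTM and ASPP; all names and sizes here are illustrative):

```python
import numpy as np

def conv2d_same(x, k):
    """'Same'-padded 2D cross-correlation for one channel (naive loops)."""
    ph, pw = k.shape[0] // 2, k.shape[1] // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = (xp[i:i + k.shape[0], j:j + k.shape[1]] * k).sum()
    return out

def multi_kernel_features(x, sizes=(3, 5, 7)):
    """Parallel convolutions at several kernel sizes, stacked as channels,
    mimicking a multi-scale spatial encoder branch."""
    feats = [conv2d_same(x, np.full((s, s), 1.0 / (s * s))) for s in sizes]
    return np.stack(feats)  # shape: (len(sizes), H, W)

sst = np.random.default_rng(0).normal(size=(16, 16))  # toy SST field
f = multi_kernel_features(sst)
# A constant field passes through the mean filters unchanged (away from edges).
flat = multi_kernel_features(np.ones((8, 8)))
```

In the actual network, the stacked multi-scale channels would feed the ConvLSTM for temporal modeling rather than being used directly.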
34. A Renovated Framework of a Convolution Neural Network with Transformer for Detecting Surface Changes from High-Resolution Remote-Sensing Images.
- Author
-
Yao, Shunyu, Wang, Han, Su, Yalu, Li, Qing, Sun, Tao, Liu, Changjun, Li, Yao, and Cheng, Deqiang
- Subjects
- *
CONVOLUTIONAL neural networks , *TRANSFORMER models , *SURFACE of the earth , *FEATURE extraction , *REMOTE sensing - Abstract
Natural hazards are considered to have a strong link with climate change and human activities. With the rapid advancements in remote sensing technology, real-time monitoring and high-resolution remote-sensing images have become increasingly available, which provide precise details about the Earth's surface and enable prompt updates to support risk identification and management. This paper proposes a new network framework with a Transformer architecture and a Residual network for detecting the changes in high-resolution remote-sensing images. The proposed model is trained using remote-sensing images from Shandong and Anhui Provinces of China in 2021 and 2022, while one district in 2023 is used to test the prediction accuracy. The performance of the proposed model is evaluated using five metrics and further compared to both convolution-based and attention-based models. The results demonstrated that the proposed structure integrates the strong image-feature-extraction capability of convolutional neural networks with the attention mechanism's ability to capture global context, resulting in significant improvements in identifying positive samples while avoiding false positives in complex image change detection. Additionally, a toolkit supporting image preprocessing is developed for practical applications. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
35. Object Identification in Land Parcels Using a Machine Learning Approach.
- Author
-
Gundermann, Niels, Löwe, Welf, Fransson, Johan E. S., Olofsson, Erika, and Wehrenpfennig, Andreas
- Subjects
- *
MACHINE learning , *CONVOLUTIONAL neural networks , *IMAGE recognition (Computer vision) , *ARTIFICIAL intelligence , *LAND use - Abstract
This paper introduces an AI-based approach to detect human-made objects, and changes in these, on land parcels. To this end, we used binary image classification performed by a convolutional neural network. Binary classification requires the selection of a decision boundary, and we provided a deterministic method for this selection. Furthermore, we varied different parameters to improve the performance of our approach, leading to a true positive rate of 91.3% and a true negative rate of 63.0%. A specific application of our work supports the administration of agricultural land parcels eligible for subsidies. As a result of our findings, authorities could reduce the effort involved in the detection of human-made changes by approximately 50%. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
36. Ship Detection with Deep Learning in Optical Remote-Sensing Images: A Survey of Challenges and Advances.
- Author
-
Zhao, Tianqi, Wang, Yongcheng, Li, Zheng, Gao, Yunxiao, Chen, Chi, Feng, Hao, and Zhao, Zhikang
- Subjects
- *
DEEP learning , *REMOTE-sensing images , *OPTICAL remote sensing , *OPTICAL images , *CONVOLUTIONAL neural networks , *TRANSFORMER models , *FEATURE extraction - Abstract
Ship detection aims to automatically identify whether there are ships in an image and to precisely classify and localize them. Regardless of whether early manually designed methods or deep learning technology are utilized, ship detection is dedicated to exploring the inherent characteristics of ships to enhance recall. Nowadays, high-precision ship detection plays a crucial role in civilian and military applications. To provide a comprehensive review of ship detection in optical remote-sensing images (SDORSIs), this paper summarizes the challenges as a guide. These challenges include complex marine environments, insufficient discriminative features, large scale variations, dense and rotated distributions, large aspect ratios, and imbalances between positive and negative samples. We meticulously review the improvement methods and conduct a detailed analysis of their strengths and weaknesses. We compile ship information from common optical remote-sensing image datasets and compare algorithm performance. We also compare and analyze the feature-extraction capabilities of CNN- and Transformer-based backbones, seeking new directions for the development of SDORSIs. Promising prospects are provided to facilitate further research. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
37. Joint Retrieval of Multiple Species of Ice Hydrometeor Parameters from Millimeter and Submillimeter Wave Brightness Temperature Based on Convolutional Neural Networks.
- Author
-
Chen, Ke, Wu, Jiasheng, and Chen, Yingying
- Subjects
- *
SUBMILLIMETER waves , *CONVOLUTIONAL neural networks , *BRIGHTNESS temperature , *MILLIMETER waves , *MONTE Carlo method , *ASTROCHEMISTRY - Abstract
Submillimeter wave radiometers are promising remote sensing tools for sounding ice cloud parameters. The Ice Cloud Imager (ICI) aboard the second generation of the EUMETSAT Polar System (EPS-SG) is the first operational submillimeter wave radiometer used for ice cloud remote sensing. Ice clouds simultaneously contain three species of ice hydrometeors—ice, snow, and graupel—whose physical distributions and submillimeter wave radiation characteristics differ. Jointly retrieving the mass parameters of the three ice hydrometeors from submillimeter brightness temperatures is therefore very challenging. In this paper, we propose a retrieval algorithm for multiple species of ice hydrometeor parameters based on convolutional neural networks (CNNs) that can jointly retrieve the total content and vertical profiles of ice, snow, and graupel particles from submillimeter brightness temperatures. The training dataset is generated by a numerical weather prediction (NWP) model and a submillimeter wave radiative transfer (RT) model. In this study, an end-to-end ICI simulation experiment involving forward modeling of the brightness temperature and retrieval of ice cloud parameters was conducted to verify the effectiveness of the proposed CNN retrieval algorithm. Compared with the classical Unet, the average relative errors of the improved RCNN–ResUnet are reduced by 11%, 25%, and 18% in GWP, IWP, and SWP retrieval, respectively. Compared with the Bayesian Monte Carlo integration algorithm, the average relative error of the total content retrieved by RCNN–ResUnet is reduced by 71%. Compared with the BP neural network algorithm, the average relative error of the vertical profiles retrieved by RCNN–ResUnet is reduced by 69%.
In addition, this algorithm was applied to actual Advanced Technology Microwave Sounder (ATMS) 183 GHz observed brightness temperatures to retrieve graupel particle parameters, with a relative error in the total content of less than 25% and a relative error in the profile of less than 35%. The results show that the proposed CNN algorithm can be applied to future spaceborne submillimeter wave radiometers to jointly retrieve the mass parameters of ice, snow, and graupel. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
38. PolSAR Image Classification with Active Complex-Valued Convolutional-Wavelet Neural Network and Markov Random Fields.
- Author
-
Liu, Lu and Li, Yongxiang
- Subjects
- *
IMAGE recognition (Computer vision) , *CONVOLUTIONAL neural networks , *SPECKLE interference , *MARKOV random fields , *WAVELET transforms , *ACTIVE learning - Abstract
PolSAR image classification has attracted extensive research in recent decades. Aiming to improve PolSAR classification performance under speckle noise, this paper proposes an active complex-valued convolutional-wavelet neural network that incorporates the dual-tree complex wavelet transform (DT-CWT) and a Markov random field (MRF). In this approach, DT-CWT is introduced into the complex-valued convolutional neural network to suppress the speckle noise of PolSAR images and maintain the structures of the learned feature maps. In addition, by applying active learning (AL), we iteratively select the most informative unlabeled training samples of the PolSAR datasets. Moreover, MRF is utilized to obtain spatial local correlation information, which has been proven effective in improving classification performance. The experimental results on three benchmark PolSAR datasets demonstrate that the proposed method achieves a significant classification performance gain, in terms of effectiveness and robustness, beyond some state-of-the-art deep learning methods. [ABSTRACT FROM AUTHOR]
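The active-learning step selects "the most informative unlabeled training samples"; predictive entropy is one standard informativeness criterion, sketched below as an assumed stand-in (the paper's exact criterion is not given in the abstract):

```python
import math

def most_informative(probs, k):
    """Rank unlabeled samples by predictive entropy and return the
    indices of the k most uncertain ones (a common AL criterion)."""
    def entropy(p):
        return -sum(pi * math.log(pi) for pi in p if pi > 0)
    order = sorted(range(len(probs)),
                   key=lambda i: entropy(probs[i]), reverse=True)
    return order[:k]
```

The selected samples would then be labeled and added to the training set before the next iteration of model training.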
- Published
- 2024
- Full Text
- View/download PDF
39. Forest Aboveground Biomass Estimation Using Multisource Remote Sensing Data and Deep Learning Algorithms: A Case Study over Hangzhou Area in China.
- Author
-
Tian, Xin, Li, Jiejie, Zhang, Fanyi, Zhang, Haibo, and Jiang, Mi
- Subjects
- *
DEEP learning , *BIOMASS estimation , *MACHINE learning , *MULTISPECTRAL imaging , *REMOTE sensing , *FOREST biomass , *CONVOLUTIONAL neural networks , *SYNTHETIC aperture radar - Abstract
The accurate estimation of forest aboveground biomass is of great significance for forest management and carbon balance monitoring. Remote sensing instruments, with their wide coverage and high spatiotemporal resolution, have been widely applied in forest parameter inversion. In this paper, the capability of different remotely sensed imagery for aboveground forest biomass estimation was investigated, including multispectral images (GaoFen-6, Sentinel-2 and Landsat-8) and various SAR (Synthetic Aperture Radar) data (GaoFen-3, Sentinel-1, ALOS-2). In particular, based on the forest inventory data of Hangzhou in China, the Random Forest (RF), Convolutional Neural Network (CNN) and Convolutional Neural Network Long Short-Term Memory (CNN-LSTM) algorithms were deployed to construct the forest biomass estimation models. The estimation accuracies were evaluated under different configurations of images and methods. The results show that, among the SAR data, ALOS-2 has a higher biomass estimation accuracy than GaoFen-3 and Sentinel-1. Moreover, the GaoFen-6 data are slightly worse than the Sentinel-2 and Landsat-8 optical data in biomass estimation. Compared with any single source, integrating multisource data can effectively enhance accuracy, with improvements ranging from 5% to 10%. The CNN-LSTM generally performs better than CNN and RF, regardless of the data used. The combination of CNN-LSTM and multisource data provided the best results in this case, achieving a maximum R2 value of up to 0.74. The majority of the biomass values in the study area in 2018 ranged from 60 to 90 Mg/ha, with an average value of 64.20 Mg/ha. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
40. CroplandCDNet: Cropland Change Detection Network for Multitemporal Remote Sensing Images Based on Multilayer Feature Transmission Fusion of an Adaptive Receptive Field.
- Author
-
Wu, Qiang, Huang, Liang, Tang, Bo-Hui, Cheng, Jiapei, Wang, Meiqi, and Zhang, Zixuan
- Subjects
- *
CONVOLUTIONAL neural networks , *CHANGE-point problems , *FARMS , *MARKOV random fields , *REMOTE-sensing images , *FEATURE extraction - Abstract
Dynamic monitoring of cropland using high spatial resolution remote sensing images is a powerful means to protect cropland resources. However, when a change detection method based on a convolutional neural network employs a large number of convolution and pooling operations to mine the deep features of cropland, the accumulation of irrelevant features and the loss of key features will lead to poor detection results. To effectively solve this problem, a novel cropland change detection network (CroplandCDNet) is proposed in this paper; this network combines an adaptive receptive field and multiscale feature transmission fusion to achieve accurate detection of cropland change information. CroplandCDNet first effectively extracts the multiscale features of cropland from bitemporal remote sensing images through the feature extraction module and subsequently embeds the receptive field adaptive SK attention (SKA) module to emphasize cropland change. Moreover, the SKA module effectively uses spatial context information for the dynamic adjustment of the convolution kernel size of cropland features at different scales. Finally, multiscale features and difference features are transmitted and fused layer by layer to obtain the content of cropland change. In the experiments, the proposed method is compared with six advanced change detection methods using the cropland change detection dataset (CLCD). The experimental results show that CroplandCDNet achieves the best F1 and OA at 76.04% and 94.47%, respectively. Its precision and recall are second best of all models at 76.46% and 75.63%, respectively. Moreover, a generalization experiment was carried out using the Jilin-1 dataset, which effectively verified the reliability of CroplandCDNet in cropland change detection. [ABSTRACT FROM AUTHOR]
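The SKA module adapts its receptive field by softmax-weighting branches with different kernel sizes, in the style of Selective-Kernel attention. A toy per-channel version of that weighting (shapes and names are illustrative, not the paper's code) looks like:

```python
import math

def sk_fuse(branch_feats, branch_logits):
    """Selective-Kernel style fusion: a softmax over per-branch logits
    gives each channel's mixing weights, and the output is the weighted
    sum of the branch features (the original SKNet uses branches with
    3x3 and 5x5 kernels; here each branch is a flat channel vector)."""
    channels = len(branch_feats[0])
    fused = []
    for ch in range(channels):
        logits = [bl[ch] for bl in branch_logits]
        exps = [math.exp(v) for v in logits]
        total = sum(exps)
        fused.append(sum(w / total * bf[ch]
                         for w, bf in zip(exps, branch_feats)))
    return fused
```

With equal logits every branch contributes equally; learned logits let the network shift each channel toward the kernel size that best matches the cropland feature's scale.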
- Published
- 2024
- Full Text
- View/download PDF
41. AIDB-Net: An Attention-Interactive Dual-Branch Convolutional Neural Network for Hyperspectral Pansharpening.
- Author
-
Sun, Qian, Sun, Yu, and Pan, Chengsheng
- Subjects
- *
CONVOLUTIONAL neural networks , *DEEP learning - Abstract
Despite notable advancements achieved on Hyperspectral (HS) pansharpening tasks through deep learning techniques, previous methods are inherently constrained by the intrinsic defects of convolution or self-attention, leading to limited performance. In this paper, we propose an Attention-Interactive Dual-Branch Convolutional Neural Network (AIDB-Net) for HS pansharpening. Our model consists purely of convolutional layers yet simultaneously inherits the strengths of both convolution and self-attention, especially the modeling of short- and long-range dependencies. Specifically, we first extract, tokenize, and align the hyperspectral image (HSI) and panchromatic image (PAN) by Overlapping Patch Embedding Blocks. Then, we design a novel Spectral-Spatial Interactive Attention that is able to globally interact with and fuse the cross-modality features. The resultant token-global similarity scores guide the refinement and renewal of the textural details and spectral characteristics within the HSI features. By deeply combining these two paradigms, our AIDB-Net significantly improves pansharpening performance. Moreover, accelerated by the convolutional inductive bias, our interactive attention can be trained without a large-scale dataset and achieves a time cost competitive with its counterparts. Compared with state-of-the-art methods, our AIDB-Net achieves improvements of 5.2%, 3.1%, and 2.2% on the PSNR metric on three public datasets, respectively. Comprehensive experiments quantitatively and qualitatively demonstrate the effectiveness and superiority of our AIDB-Net. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
42. Learning Point Processes and Convolutional Neural Networks for Object Detection in Satellite Images.
- Author
-
Mabon, Jules, Ortner, Mathias, and Zerubia, Josiane
- Subjects
- *
OBJECT recognition (Computer vision) , *CONVOLUTIONAL neural networks , *POINT processes , *REMOTE-sensing images , *GABOR filters , *ARTIFICIAL satellites - Abstract
Convolutional neural networks (CNN) have shown great results for object-detection tasks by learning texture and pattern-extraction filters. However, object-level interactions are harder to grasp without increasing the complexity of the architectures. On the other hand, Point Process models propose to solve the detection of the configuration of objects as a whole, allowing the factoring in of the image data and the objects' prior interactions. In this paper, we propose combining the information extracted by a CNN with priors on objects within a Markov Marked Point Process framework. We also propose a method to learn the parameters of this Energy-Based Model. We apply this model to the detection of small vehicles in optical satellite imagery, where the image information needs to be complemented with object interaction priors because of noise and small object sizes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
43. A Lightning Classification Method Based on Convolutional Encoding Features.
- Author
-
Zhu, Shunxing, Zhang, Yang, Fan, Yanfeng, Sun, Xiubin, Zheng, Dong, Zhang, Yijun, Lyu, Weitao, Zhang, Huiyi, and Wang, Jingxuan
- Subjects
- *
CONVOLUTIONAL neural networks , *RANDOM forest algorithms , *THUNDERSTORMS - Abstract
At present, in operational lightning location systems, the classification of lightning discharge types is mostly based on lightning pulse signal features, and there is still much room for improvement. We propose a lightning discharge classification method based on convolutional encoding features. This method utilizes convolutional neural networks to extract encoding features and uses random forests to classify the extracted encoding features, achieving high-accuracy discrimination of various lightning discharge events. Compared with traditional multi-parameter-based methods, the new method proposed in this paper can identify multiple lightning discharge events and does not require precise, detailed feature engineering to extract individual pulse parameters. The accuracy of this method in identifying intra-cloud flashes (ICs), cloud-to-ground flashes (CGs), and narrow bipolar events (NBEs) is 97%, which is higher than that of multi-parameter methods. Moreover, our method can complete the classification of lightning signals at a faster speed. Under the same conditions, the new method requires only 28.2 µs to identify one pulse, while deep-learning-based methods require 300 µs. This faster recognition speed and higher accuracy in identifying multiple discharge types can better meet the needs of real-time operational location. [ABSTRACT FROM AUTHOR]
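The two-stage design, convolutional encoding features feeding a downstream classifier, can be sketched with a hand-fixed 1-D convolution as the "encoder" and a nearest-centroid rule standing in for the paper's random forest (both stand-ins are ours, chosen to keep the sketch dependency-free):

```python
def conv_encode(signal, kernel=(1.0, -1.0)):
    """Valid-mode 1-D convolution: a crude edge/pulse feature extractor
    (a trained CNN would learn such kernels rather than fix them)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def encode_features(signal):
    """Summarize the feature map into a fixed-length encoding vector."""
    fm = conv_encode(signal)
    return (max(fm), min(fm), sum(abs(v) for v in fm) / len(fm))

def nearest_centroid(feat, centroids):
    """Assign the label of the closest class centroid (stand-in for
    the random-forest classifier; centroids: {label: feature tuple})."""
    def d2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: d2(feat, centroids[label]))
```

The point of the split is the one the abstract makes: the encoder replaces hand-crafted pulse parameters, and the classifier only ever sees the learned encoding.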
- Published
- 2024
- Full Text
- View/download PDF
44. An Overlay Accelerator of DeepLab CNN for Spacecraft Image Segmentation on FPGA.
- Author
-
Guo, Zibo, Liu, Kai, Liu, Wei, Sun, Xiaoyao, Ding, Chongyang, and Li, Shangrong
- Subjects
- *
IMAGE segmentation , *COMPILERS (Computer programs) , *SPACE vehicles , *CONVOLUTIONAL neural networks , *FIELD programmable gate arrays , *INSTRUCTION set architecture - Abstract
Due to the absence of communication and coordination with external spacecraft, non-cooperative spacecraft present challenges for the servicing spacecraft in acquiring information about their pose and location. The accurate segmentation of non-cooperative spacecraft components in images is a crucial step in autonomously sensing the pose of non-cooperative spacecraft. This paper presents a novel overlay accelerator of DeepLab Convolutional Neural Networks (CNNs) for spacecraft image segmentation on an FPGA. First, several software–hardware co-design aspects are investigated: (1) A CNN-domain COD instruction set (Control, Operation, Data Transfer) is presented, based on a Load–Store architecture, to enable the implementation of accelerator overlays. (2) An RTL-based prototype accelerator is developed for the COD instruction set. The accelerator incorporates dedicated units for instruction decoding and dispatch, scheduling, memory management, and operation execution. (3) A compiler is designed that leverages tiling and operation-fusion techniques to optimize the execution of CNNs, generating binary instructions for the optimized operations. Our accelerator is implemented on a Xilinx Virtex-7 XC7VX690T FPGA at 200 MHz. Experiments demonstrate that, with INT16 quantization, our accelerator achieves an accuracy (mIoU) of 77.84%, only a 0.2% degradation compared to the original full-precision model, in accelerating the segmentation model of DeepLabv3+ ResNet18 on the spacecraft component images (SCIs) dataset. The accelerator boasts a performance of 184.19 GOPS/s and a computational efficiency (runtime throughput/theoretical roof throughput) of 88.72%. Compared to previous work, our accelerator improves performance by 1.5× and computational efficiency by 43.93%, while consuming similar hardware resources.
Additionally, in terms of instruction encoding, our instructions reduce the size by 1.5× to 49× when compiling the same model compared to previous work. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
45. Hyperspectral Image Classification Based on Mutually Guided Image Filtering.
- Author
-
Zhan, Ying, Hu, Dan, Yu, Xianchuan, and Wang, Yufeng
- Subjects
- *
IMAGE recognition (Computer vision) , *ARTIFICIAL neural networks , *CONVOLUTIONAL neural networks , *FEATURE extraction , *GENERATIVE adversarial networks , *HYPERSPECTRAL imaging systems , *REMOTE sensing - Abstract
Hyperspectral remote sensing images (HSIs) have both spectral and spatial characteristics, and the adept exploitation of these attributes is central to enhancing classification accuracy. To effectively utilize spatial and spectral features to classify HSIs, this paper proposes a method for the spatial feature extraction of HSIs based on a mutually guided image filter (muGIF), combined with band-distance-grouped principal components. Firstly, aiming at the problem that previous guided image filtering cannot effectively deal with inconsistent structures between the guidance and target information, a method for extracting spatial features using muGIF is proposed. Then, aiming at the information loss caused by using a single principal component as the guide image in traditional GIF-based spatial–spectral classification, a spatial feature-extraction framework based on band-distance-grouped principal components is proposed. The method groups the bands according to the band distance and extracts the principal component of each band subset as the guide map for that subset when filtering the HSIs. A deep convolutional neural network model and a generative adversarial network model for the filtered HSIs are constructed and then trained for spatial–spectral classification of HSIs. Experiments show that, compared with traditional methods and several popular filter-based spatial–spectral HSI classification methods, the proposed muGIF-based methods can effectively extract spatial–spectral features and improve the classification accuracy of HSIs. [ABSTRACT FROM AUTHOR]
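One plausible reading of the band-distance grouping, contiguous subsets of bands with each subset's first principal component serving as its guide image, can be sketched as follows (grouping by band index is our assumption; the paper's exact distance measure may differ):

```python
import numpy as np

def grouped_principal_components(cube, group_size):
    """cube: (H, W, B) hyperspectral array. Bands are grouped into
    contiguous index-distance subsets, and the first principal
    component of each subset is returned as that subset's guide image."""
    h, w, b = cube.shape
    guides = []
    for start in range(0, b, group_size):
        sub = cube[:, :, start:start + group_size].reshape(h * w, -1)
        sub = sub - sub.mean(axis=0)          # center before PCA
        # first right-singular vector = leading principal axis
        _, _, vt = np.linalg.svd(sub, full_matrices=False)
        guides.append((sub @ vt[0]).reshape(h, w))
    return guides
```

Each guide would then drive the muGIF filtering of the bands in its own group, instead of one global principal component guiding every band.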
- Published
- 2024
- Full Text
- View/download PDF
46. Transfer-Learning-Based Human Activity Recognition Using Antenna Array.
- Author
-
Ye, Kun, Wu, Sheng, Cai, Yongbin, Zhou, Lang, Xiao, Lijun, Zhang, Xuebo, Zheng, Zheng, and Lin, Jiaqing
- Subjects
- *
HUMAN activity recognition , *ANTENNA arrays , *CONVOLUTIONAL neural networks , *ARRAY processing - Abstract
Due to its low cost and privacy protection, Channel-State-Information (CSI)-based activity detection has gained interest recently. However, achieving high accuracy is challenging in practice, as a significant number of training samples are required. To address the issues of small sample size and cross-scenario deployment in neural network training, this paper proposes Wi-AR, a WiFi human activity-recognition system based on transfer learning using an antenna array. First, an Intel 5300 network card collects CSI measurements through an antenna array and processes them with a low-pass filter to reduce noise. Then, a threshold-based sliding window method is applied to extract the signals of individual activities, which are further transformed into time–frequency diagrams. Finally, the resulting diagrams are used as input to a pretrained ResNet18 to recognize human activities. The proposed Wi-AR was evaluated using a dataset collected in three different room layouts. The test results show that Wi-AR recognizes human activities with a consistent accuracy of about 94%, outperforming a conventional convolutional neural network approach. [ABSTRACT FROM AUTHOR]
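The threshold-based sliding window step, flagging high-energy windows and merging consecutive flags into activity segments, might look like this (using window variance as the energy statistic is an assumption; the paper's exact statistic is not stated):

```python
def extract_activity_segments(amplitude, win, thresh):
    """Threshold-based sliding window: a window whose variance-like
    energy exceeds `thresh` is flagged as activity; consecutive flagged
    windows are merged into (start, end) segments of sample indices."""
    flags = []
    for i in range(0, len(amplitude) - win + 1, win):
        chunk = amplitude[i:i + win]
        mean = sum(chunk) / win
        energy = sum((x - mean) ** 2 for x in chunk) / win
        flags.append(energy > thresh)
    segments, start = [], None
    for idx, f in enumerate(flags):
        if f and start is None:
            start = idx * win
        if not f and start is not None:
            segments.append((start, idx * win))
            start = None
    if start is not None:
        segments.append((start, len(flags) * win))
    return segments
```

Each extracted segment would then be converted into a time–frequency diagram and fed to the pretrained ResNet18.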
- Published
- 2024
- Full Text
- View/download PDF
47. Remote Sensing for Maritime Monitoring and Vessel Identification.
- Author
-
Salerno, Emanuele, Di Paola, Claudio, and Lo Duca, Angelica
- Subjects
- *
DEEP learning , *REMOTE sensing , *CONVOLUTIONAL neural networks , *SURVEILLANCE radar , *SYNTHETIC aperture radar , *INFORMATION technology , *PATTERN recognition systems - Abstract
This document explores the significance of remote sensing in monitoring maritime activities and identifying vessels. It emphasizes the need for surveillance to ensure safety, security, and emergency management, given the increasing number of vessels worldwide. The document highlights the use of technologies like the Automatic Identification System (AIS) and remote sensing in situations where collaborative systems are not reliable. It also discusses the integration of data from different sensors and the application of data science techniques for a comprehensive assessment of maritime traffic. The document concludes by summarizing research papers on ship detection, tracking, and classification using various sensors and data processing techniques. [Extracted from the article]
- Published
- 2024
- Full Text
- View/download PDF
48. Multi-View Scene Classification Based on Feature Integration and Evidence Decision Fusion.
- Author
-
Zhou, Weixun, Shi, Yongxin, and Huang, Xiao
- Subjects
- *
FEATURE extraction , *IMAGE recognition (Computer vision) , *IMAGE fusion , *CONVOLUTIONAL neural networks , *DEEP learning - Abstract
Leveraging multi-view remote sensing images in scene classification tasks significantly enhances classification precision. This approach, however, poses challenges: the simultaneous use of multi-view images often leads to a misalignment between the visual content and semantic labels, complicating the classification process. In addition, as the number of image viewpoints increases, quality problems in the remote sensing images further limit the effectiveness of multi-view image classification. Traditional scene classification methods predominantly employ SoftMax-based deep learning techniques, which lack the capability to assess the quality of remote sensing images or to provide explicit explanations for the network's predictions. To address these issues, this paper introduces a novel end-to-end multi-view decision fusion network specifically designed for remote sensing scene classification. The network integrates information from multi-view remote sensing images under the guidance of image credibility and uncertainty, and when the multi-view image fusion process encounters conflicts, it greatly alleviates them and provides more reasonable and credible predictions for multi-view scene classification. Initially, multi-scale features are extracted from the multi-view images using convolutional neural networks (CNNs). Following this, an asymptotic adaptive feature fusion module (AAFFM) is constructed to gradually integrate these multi-scale features. An adaptive spatial fusion method is then applied to assign different spatial weights to the multi-scale feature maps, significantly enhancing the model's feature discrimination capability. Finally, an evidence decision fusion module (EDFM), utilizing evidence theory and the Dirichlet distribution, is developed. This module quantitatively assesses the uncertainty in the multi-perspective image classification process.
By fusing multi-perspective remote sensing image information in this module, a rational explanation for the prediction results is provided. The efficacy of the proposed method was validated through experiments conducted on the AiRound and CV-BrCT datasets. The results show that our method not only improves single-view scene classification results but also advances multi-view remote sensing scene classification by accurately characterizing the scene and mitigating conflicts in the fusion process. [ABSTRACT FROM AUTHOR]
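Evidence-theoretic fusion with a Dirichlet model typically forms per-view opinions (class beliefs plus an explicit uncertainty mass) and combines them with a reduced Dempster rule; a sketch in that standard subjective-logic form (not necessarily the paper's exact formulation) is:

```python
def dirichlet_opinion(evidence):
    """Evidence -> (beliefs, uncertainty) under subjective logic:
    with K classes, alpha_k = e_k + 1, S = sum(alpha),
    b_k = e_k / S, and u = K / S."""
    k = len(evidence)
    s = sum(e + 1 for e in evidence)
    return [e / s for e in evidence], k / s

def fuse_opinions(b1, u1, b2, u2):
    """Reduced Dempster combination of two views' opinions: conflicting
    cross-class belief mass is discarded and the rest renormalized."""
    conflict = sum(b1[i] * b2[j] for i in range(len(b1))
                   for j in range(len(b2)) if i != j)
    scale = 1.0 - conflict
    b = [(b1[k] * b2[k] + b1[k] * u2 + b2[k] * u1) / scale
         for k in range(len(b1))]
    u = u1 * u2 / scale
    return b, u
```

Note the behavior this buys over SoftMax: a low-quality view contributes mostly uncertainty mass, so fusing it leaves a confident view's opinion essentially unchanged.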
- Published
- 2024
- Full Text
- View/download PDF
49. A CFAR-Enhanced Ship Detector for SAR Images Based on YOLOv5s.
- Author
-
Wen, Xue, Zhang, Shaoming, Wang, Jianmei, Yao, Tangjun, and Tang, Yan
- Subjects
- *
IMAGE recognition (Computer vision) , *IMAGE converters , *SYNTHETIC aperture radar , *TRAFFIC monitoring , *CONVOLUTIONAL neural networks , *RESEARCH vessels , *IMAGE analysis - Abstract
Ship detection and recognition in Synthetic Aperture Radar (SAR) images are crucial for maritime surveillance and traffic management. The limited availability of high-quality datasets hinders in-depth exploration of ship features in complex SAR images. Most existing SAR ship research is based on Convolutional Neural Networks (CNNs); although deep learning advances SAR image interpretation, it often prioritizes recognition over computational efficiency and underutilizes the prior information in SAR images. Therefore, this paper proposes YOLOv5s-based ship detection in SAR images. Firstly, for comprehensive detection enhancement, we employ the lightweight YOLOv5s model as the baseline. Secondly, we introduce a sub-net into YOLOv5s that learns traditional Constant False Alarm Rate (CFAR) features to augment the ship feature representation. Additionally, we incorporate frequency-domain information into the channel attention mechanism to further improve detection. Extensive experiments on the Ship Recognition and Detection Dataset (SRSDDv1.0) in complex SAR scenarios confirm our method's 68.04% detection accuracy and 60.25% recall, with a compact 18.51 M model size. Our network surpasses peers in mAP, F1 score, model size, and inference speed, displaying robustness across diverse complex scenes. [ABSTRACT FROM AUTHOR]
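The CFAR prior that the sub-net draws on is classically computed by cell averaging: a cell is declared a detection when its power exceeds a scaled estimate of the surrounding clutter. A 1-D cell-averaging CFAR sketch (parameter values are illustrative, not the paper's):

```python
def ca_cfar(power, guard, train, scale):
    """1-D cell-averaging CFAR: a cell is a detection when its power
    exceeds `scale` times the mean of the `train` training cells on
    each side, with `guard` cells next to it excluded from the mean."""
    detections = []
    half = guard + train
    for i in range(half, len(power) - half):
        left = power[i - half:i - guard]            # train cells, left
        right = power[i + guard + 1:i + half + 1]   # train cells, right
        noise = (sum(left) + sum(right)) / (2 * train)
        if power[i] > scale * noise:
            detections.append(i)
    return detections
```

In SAR imagery the same idea runs in 2-D over pixel windows; the adaptive noise estimate is what makes the false-alarm rate "constant" across varying sea clutter.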
- Published
- 2024
- Full Text
- View/download PDF
50. TransHSI: A Hybrid CNN-Transformer Method for Disjoint Sample-Based Hyperspectral Image Classification.
- Author
-
Zhang, Ping, Yu, Haiyang, Li, Pengao, and Wang, Ruili
- Subjects
- *
IMAGE recognition (Computer vision) , *CONVOLUTIONAL neural networks , *TRANSFORMER models , *CLASSIFICATION algorithms , *MULTISENSOR data fusion , *FEATURE extraction - Abstract
Research on hyperspectral image (HSI) classification has seen significant progress with the use of convolutional neural networks (CNNs) and Transformer blocks. However, previous studies primarily incorporated Transformer blocks at the end of their network architectures. Due to significant differences between the spectral and spatial features in HSIs, the extraction of both global and local spectral–spatial features remains incomplete. To address this challenge, this paper introduces a novel method called TransHSI. This method incorporates a new spectral–spatial feature extraction module that fuses 3D CNNs with Transformer blocks to extract the local and global spectral features of HSIs, and then combines 2D CNNs with Transformer blocks to comprehensively capture the local and global spatial features. Furthermore, a fusion module is proposed that not only integrates the learned shallow and deep features of HSIs but also applies a semantic tokenizer to transform the fused features, enhancing their discriminative power. This paper conducts experiments on three public datasets: Indian Pines, Pavia University, and Data Fusion Contest 2018, with training and test sets selected based on a disjoint sampling strategy. We perform a comparative analysis with 11 traditional and advanced HSI classification algorithms. The experimental results demonstrate that the proposed TransHSI algorithm achieves the highest overall accuracies and kappa coefficients, indicating competitive performance. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF