172 results for "Geo-Localization"
Search Results
2. Aerial-view geo-localization based on multi-layer local pattern cross-attention network.
- Author
- Li, Haoran, Wang, Tingyu, Chen, Quan, Zhao, Qiang, Jiang, Shaowei, Yan, Chenggang, and Zheng, Bolun
- Subjects
- DEEP learning, GEOTAGGING, DATABASES, REMOTE-sensing images
- Abstract
Aerial-view geo-localization aims to determine locations of interest to drones by matching drone-view images against a satellite database with geo-tagging. The key underpinning of this task is to mine discriminative features to form a view-invariant representation of the same target location. To achieve this purpose, existing methods usually focus on extracting fine-grained information from the final feature map while neglecting the importance of middle-layer outputs. In this work, we propose a Transformer-based network, named Multi-layer Local Pattern Cross Attention Network (MLPCAN). Particularly, we employ the cross-attention block (CAB) to establish correlations between information of feature maps from different layers when images are fed into the network. Then, we apply the square-ring partition strategy to divide feature maps from different layers and acquire multiple local pattern blocks. For the information misalignment within multi-layer features, we propose the multi-layer aggregation block (MAB) to aggregate the high-association feature blocks obtained by the division. Extensive experiments on two public datasets, i.e., University-1652 and SUES-200, show that the proposed model significantly improves the accuracy of geo-localization and achieves competitive results. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
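The square-ring partition strategy described in the MLPCAN abstract above divides a feature map into concentric square rings so that each ring becomes one local pattern block. A minimal sketch in plain Python follows; the function name and the exact ring-assignment rule are illustrative assumptions, not the paper's implementation.

```python
def square_ring_partition(feat, num_rings):
    """Assign each spatial position of an H x W feature map to one of
    `num_rings` concentric square rings (innermost ring = index 0).
    `feat` is a list of rows of feature vectors; returns one list of
    feature vectors per ring."""
    H, W = len(feat), len(feat[0])
    cy, cx = (H - 1) / 2.0, (W - 1) / 2.0
    max_d = max(cy, cx)  # largest Chebyshev distance to the centre
    rings = [[] for _ in range(num_rings)]
    for i in range(H):
        for j in range(W):
            # Chebyshev distance gives square (not circular) rings
            d = (max(abs(i - cy), abs(j - cx)) / max_d) if max_d else 0.0
            k = min(int(d * num_rings), num_rings - 1)
            rings[k].append(feat[i][j])
    return rings
```

Each ring's features would then be pooled into one block descriptor before the multi-layer aggregation step.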
3. A Contrastive Learning Based Multiview Scene Matching Method for UAV View Geo-Localization.
- Author
- He, Qiyi, Xu, Ao, Zhang, Yifan, Ye, Zhiwei, Zhou, Wen, Xi, Ruijie, and Lin, Qiao
- Subjects
- TRANSFORMER models, REMOTE-sensing images, DRONE aircraft, FEATURE extraction, IMAGE retrieval
- Abstract
Multi-view scene matching refers to the establishment of a mapping relationship between images captured from different perspectives, such as those taken by unmanned aerial vehicles (UAVs) and satellites. This technology is crucial for the geo-localization of UAV views. However, the geometric discrepancies between images from different perspectives, combined with the inherent computational constraints of UAVs, present significant challenges for matching UAV and satellite images. Additionally, the imbalance of positive and negative samples between drone and satellite images during model training can lead to instability. To address these challenges, this study proposes a novel and efficient cross-view geo-localization framework called MSM-Transformer. The framework employs the Dual Attention Vision Transformer (DaViT) as the core architecture for feature extraction, which significantly enhances the modeling capacity for global features and the contextual relevance of adjacent regions. The weight-sharing mechanism in MSM-Transformer effectively reduces model complexity, making it highly suitable for deployment on embedded devices such as UAVs and satellites. Furthermore, the framework introduces a contrastive learning-based Symmetric Decoupled Contrastive Learning (DCL) loss function, which effectively mitigates the issue of sample imbalance between satellite and UAV images. Experimental validation on the University-1652 dataset demonstrates that MSM-Transformer achieves outstanding performance, delivering optimal matching results with a minimal number of parameters. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
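The Symmetric Decoupled Contrastive Learning (DCL) loss named in the abstract above builds on decoupled contrastive learning, in which the positive pair is removed from the InfoNCE denominator. A minimal single-query sketch, assuming cosine similarities are already computed; averaging over the two retrieval directions is one plausible reading of "symmetric", not the paper's exact formula.

```python
import math

def dcl_loss(sim_pos, sim_negs, tau=0.1):
    """Decoupled contrastive loss for one query: unlike InfoNCE, the
    positive similarity is excluded from the log-sum-exp denominator,
    removing the positive/negative coupling that can destabilise
    training when positives and negatives are imbalanced."""
    return -sim_pos / tau + math.log(sum(math.exp(s / tau) for s in sim_negs))

def symmetric_dcl_loss(sim_pos, negs_drone_to_sat, negs_sat_to_drone, tau=0.1):
    """Average the decoupled loss over both retrieval directions
    (drone->satellite and satellite->drone)."""
    return 0.5 * (dcl_loss(sim_pos, negs_drone_to_sat, tau)
                  + dcl_loss(sim_pos, negs_sat_to_drone, tau))
```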
4. AST: An Attention-Guided Segment Transformer for Drone-Based Cross-View Geo-Localization
- Author
- Zhao, Zichuan, Tang, Tianhang, Chen, Jie, Shi, Xuelei, Liu, Yiguang, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Zhang, Fang-Lue, editor, and Sharf, Andrei, editor
- Published
- 2024
- Full Text
- View/download PDF
5. A Cross-View Geo-Localization Algorithm Using UAV Image and Satellite Image.
- Author
- Fan, Jiqi, Zheng, Enhui, He, Yufei, and Yang, Jianxing
- Subjects
- REMOTE-sensing images, TRANSFORMER models, ALGORITHMS, TECHNOLOGY transfer, MACHINE learning
- Abstract
Within research on the cross-view geolocation of UAVs, differences in image sources and interference from similar scenes pose huge challenges. Inspired by multimodal machine learning, in this paper, we design a single-stream pyramid transformer network (SSPT). The backbone of the model uses the self-attention mechanism to enrich its own internal features in the early stage and uses the cross-attention mechanism in the later stage to refine and interact with different features to eliminate irrelevant interference. In addition, in the post-processing part of the model, a header module is designed for upsampling to generate heat maps, and a Gaussian weight window is designed to assign label weights to make the model converge better. Together, these methods improve the positioning accuracy of UAV images in satellite images. Finally, we also use style transfer technology to simulate various environmental changes in order to expand the experimental data, further proving the environmental adaptability and robustness of the method. The final experimental results show that our method yields significant performance improvement: The relative distance score (RDS) of the SSPT-384 model on the benchmark UL14 dataset is significantly improved from 76.25% to 84.40%, while the meter-level accuracy (MA) of 3 m, 5 m, and 20 m is increased by 12%, 12%, and 10%, respectively. For the SSPT-256 model, the RDS has been increased to 82.21%, and the meter-level accuracy (MA) of 3 m, 5 m, and 20 m has increased by 5%, 5%, and 7%, respectively. It still shows strong robustness on the extended thermal infrared (TIR), nighttime, and rainy day datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
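The SSPT abstract above mentions a Gaussian weight window used to assign label weights for the upsampled heat map. One common realisation is a soft label map peaked at the ground-truth pixel, so near-miss predictions are penalised less than distant ones; the function name, σ, and normalisation are assumptions, as the abstract does not give the exact form.

```python
import math

def gaussian_label_window(H, W, cy, cx, sigma):
    """Soft heat-map label: a Gaussian window centred on the ground-truth
    pixel (cy, cx).  Values decay smoothly with distance, giving the model
    a graded training signal instead of a single one-hot pixel."""
    return [[math.exp(-((i - cy) ** 2 + (j - cx) ** 2) / (2 * sigma ** 2))
             for j in range(W)] for i in range(H)]
```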
6. IML-Net: A Framework for Cross-View Geo-Localization with Multi-Domain Remote Sensing Data.
- Author
- Yan, Yiming, Wang, Mengyuan, Su, Nan, Hou, Wei, Zhao, Chunhui, and Wang, Wenxuan
- Subjects
- REMOTE sensing, WIRELESS geolocation systems, FEATURE extraction, IMAGE reconstruction, NETWORK performance, WEATHER, SHARED workspaces
- Abstract
Cross-view geolocation is a valuable yet challenging task. In practical applications, the images targeted by cross-view geolocation technology encompass multi-domain remote sensing images, including those from different platforms (e.g., drone cameras and satellites), different perspectives (e.g., nadir and oblique), and different temporal conditions (e.g., various seasons and weather conditions). Based on the characteristics of these images, we have designed an effective framework, Image Reconstruction and Multi-Unit Mutual Learning Net (IML-Net), for accomplishing cross-view geolocation tasks. By incorporating a deconvolutional network into the architecture to reconstruct images, we can better bridge the differences in remote sensing image features across different domains. This enables the mapping of target images from different platforms and perspectives into a shared latent space representation, obtaining more discriminative feature descriptors. The process enhances the robustness of feature extraction for locating targets across a wide range of perspectives. To improve the network's performance, we introduce attention regions learned from different units as augmented data during the training process. For the current cross-view geolocation datasets, the use of large-scale datasets is limited due to high costs and privacy concerns, leading to the prevalent use of simulated data. However, real data allow the network to learn more generalizable features. To make the model more robust and stable, we collected two groups of multi-domain datasets from the Zurich and Harbin regions, incorporating real data into the cross-view geolocation task to construct the ZHcity750 Dataset. Our framework is evaluated on the cross-domain ZHcity750 Dataset, which shows competitive results compared to state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
7. Image and Object Geo-Localization.
- Author
- Wilson, Daniel, Zhang, Xiaohan, Sultani, Waqas, and Wshah, Safwan
- Subjects
- DEEP learning, MACHINE learning, GLOBAL Positioning System, LOCALIZATION (Mathematics), REMOTE-sensing images, AUGMENTED reality, ROBOTICS
- Abstract
The concept of geo-localization broadly refers to the process of determining an entity's geographical location, typically in the form of Global Positioning System (GPS) coordinates. The entity of interest may be an image, a sequence of images, a video, a satellite image, or even objects visible within the image. Recently, massive datasets of GPS-tagged media have become available due to smartphones and the internet, and deep learning has risen to prominence and enhanced the performance capabilities of machine learning models. These developments have enabled the rise of image and object geo-localization, which has impacted a wide range of applications such as augmented reality, robotics, self-driving vehicles, road maintenance, and 3D reconstruction. This paper provides a comprehensive survey of visual geo-localization, which may involve either determining the location at which an image has been captured (image geo-localization) or geolocating objects within an image (object geo-localization). We will provide an in-depth study of visual geo-localization including a summary of popular algorithms, a description of proposed datasets, and an analysis of performance results to illustrate the current state of the field. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
8. UAV Geo-Localization for Navigation: A Survey
- Author
- Danilo Avola, Luigi Cinque, Emad Emam, Federico Fontana, Gian Luca Foresti, Marco Raoul Marini, Alessio Mecca, and Daniele Pannone
- Subjects
- UAV, geo-localization, navigation, guidance, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
During the flight, Unmanned Aerial Vehicles (UAVs) usually exploit internal sensors to determine their position. The most useful and used one is the Global Positioning System (GPS), or, more in general, any Global Navigation Satellite System (GNSS). Modern GPSs provide the correct device’s location with a few meters of displacement, especially in a scenario with good weather and open sky. However, the lack of these optimal conditions highly impacts the accuracy. Moreover, in restricted areas or fields of war, several anti-drone techniques are applied to limit UAVs capabilities. Without proper counter solutions, UAVs cannot continue their task and sometimes are not even able to come back since they are not aware of their position. During the last years, plenty of techniques have been developed to provide UAVs with a knowledge of their location that is not strictly connected to the availability of the GPS sensor. This research field is commonly called Geo-Localization and can be considered one of the hot topics of UAV research. Moreover, research is going further, trying to provide UAVs with fully autonomous navigation systems that do not use hijackable sensors. This survey aims to provide a quick guide to the newest and more promising methodologies for UAV Geo-Localization for Navigation tasks, showing the differences and the related application fields.
- Published
- 2024
- Full Text
- View/download PDF
9. SeGCN: A Semantic-Aware Graph Convolutional Network for UAV Geo-Localization
- Author
- Xiangzeng Liu, Ziyao Wang, Yue Wu, and Qiguang Miao
- Subjects
- Geo-localization, graph convolutional networks (GCNs), semantic inference, transformer, unmanned aerial vehicle (UAV) navigation, Ocean engineering, TC1501-1800, Geophysics. Cosmic physics, QC801-809
- Abstract
Cross-view geo-localization via scene matching is crucial in unmanned aerial vehicle (UAV) systems in global navigation satellite system denial environment. However, images in the same scene may undergo geometric distortion and occlusion due to differences in capture viewpoint, time, and platform. The existing methods mainly extract consistent features between images by CNNs, while ignoring the semantic distribution and structural information of the objects. Aiming at addressing this issue, we introduce a semantic-aware graph convolutional network (SeGCN). To improve consistent representation of object features from different viewpoints, potential semantic features are inferred via cross-attention of image context. Then, for exploring the structural information of objects, SeGCN performs graph convolution on graph structures constructed from the same semantic features. Finally, the composite features generated by SeGCN and backbone are utilized for scene matching. Comprehensive experiments conducted on the University-1652 and SUES-200 benchmarks establish that the proposed approach attains the highest levels of accuracy in both localization and navigation tasks. Furthermore, we conducted localization simulation experiments on our real UAV datasets, confirming the effectiveness of SeGCN in real world application scenarios.
- Published
- 2024
- Full Text
- View/download PDF
10. Vision-Based UAV Self-Positioning in Low-Altitude Urban Environments.
- Author
- Dai, Ming, Zheng, Enhui, Feng, Zhenhua, Qi, Lei, Zhuang, Jiedong, and Yang, Wankou
- Subjects
- TRANSFORMER models, TELECOMMUNICATION satellites, CONVOLUTIONAL neural networks, CITIES & towns, DRONE aircraft
- Abstract
Unmanned Aerial Vehicles (UAVs) rely on satellite systems for stable positioning. However, due to limited satellite coverage or communication disruptions, UAVs may lose signals for positioning. In such situations, vision-based techniques can serve as an alternative, ensuring the self-positioning capability of UAVs. However, most of the existing datasets are developed for the geo-localization task of the objects captured by UAVs, rather than UAV self-positioning. Furthermore, the existing UAV datasets apply discrete sampling to synthetic data, such as Google Maps, neglecting the crucial aspects of dense sampling and the uncertainties commonly experienced in practical scenarios. To address these issues, this paper presents a new dataset, DenseUAV, that is the first publicly available dataset tailored for the UAV self-positioning task. DenseUAV adopts dense sampling on UAV images obtained in low-altitude urban areas. In total, over 27K UAV- and satellite-view images of 14 university campuses are collected and annotated. In terms of methodology, we first verify the superiority of Transformers over CNNs for the proposed task. Then we incorporate metric learning into representation learning to enhance the model’s discriminative capacity and to reduce the modality discrepancy. Besides, to facilitate joint learning from both the satellite and UAV views, we introduce a mutually supervised learning approach. Last, we enhance the Recall@K metric and introduce a new measurement, SDM@K, to evaluate both the retrieval and localization performance for the proposed task. As a result, the proposed baseline method achieves a remarkable Recall@1 score of 83.01% and an SDM@1 score of 86.50% on DenseUAV. The dataset and code have been made publicly available on https://github.com/Dmmm1997/DenseUAV. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
11. A Contrastive Learning Based Multiview Scene Matching Method for UAV View Geo-Localization
- Author
- Qiyi He, Ao Xu, Yifan Zhang, Zhiwei Ye, Wen Zhou, Ruijie Xi, and Qiao Lin
- Subjects
- geo-localization, contrastive learning, multi-view scene matching, transformer, image retrieval, Science
- Abstract
Multi-view scene matching refers to the establishment of a mapping relationship between images captured from different perspectives, such as those taken by unmanned aerial vehicles (UAVs) and satellites. This technology is crucial for the geo-localization of UAV views. However, the geometric discrepancies between images from different perspectives, combined with the inherent computational constraints of UAVs, present significant challenges for matching UAV and satellite images. Additionally, the imbalance of positive and negative samples between drone and satellite images during model training can lead to instability. To address these challenges, this study proposes a novel and efficient cross-view geo-localization framework called MSM-Transformer. The framework employs the Dual Attention Vision Transformer (DaViT) as the core architecture for feature extraction, which significantly enhances the modeling capacity for global features and the contextual relevance of adjacent regions. The weight-sharing mechanism in MSM-Transformer effectively reduces model complexity, making it highly suitable for deployment on embedded devices such as UAVs and satellites. Furthermore, the framework introduces a contrastive learning-based Symmetric Decoupled Contrastive Learning (DCL) loss function, which effectively mitigates the issue of sample imbalance between satellite and UAV images. Experimental validation on the University-1652 dataset demonstrates that MSM-Transformer achieves outstanding performance, delivering optimal matching results with a minimal number of parameters.
- Published
- 2024
- Full Text
- View/download PDF
12. Implementation of an Extrapolated Single-Propagation Particle Filter for Interference Source Localization Using a Sensor Network
- Author
- Vu, Duc Dung, Biswas, Sanat K., Kan, Alan, Cetin, Ediz, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Jiming, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Oneto, Luca, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Zhang, Junjie James, Series Editor, Suryadevara, Nagender Kumar, editor, George, Boby, editor, Jayasundera, Krishanthi P., editor, and Mukhopadhyay, Subhas Chandra, editor
- Published
- 2023
- Full Text
- View/download PDF
13. Cross-view Geo-localization Based on Cross-domain Matching
- Author
- Wu, Xiaokang, Ma, Qianguang, Li, Qi, Yu, Yuanlong, Liu, Wenxi, Xhafa, Fatos, Series Editor, Xiong, Ning, editor, Li, Maozhen, editor, Li, Kenli, editor, Xiao, Zheng, editor, Liao, Longlong, editor, and Wang, Lipo, editor
- Published
- 2023
- Full Text
- View/download PDF
14. Feature Relation Guided Cross-View Image Based Geo-Localization.
- Author
- Hou, Qingfeng, Lu, Jun, Guo, Haitao, Liu, Xiangyun, Gong, Zhihui, Zhu, Kun, and Ping, Yifan
- Subjects
- AUGMENTED reality, IMAGE registration, FEATURE extraction, REMOTE sensing, GEOTAGGING, LOCALIZATION (Mathematics), MATHEMATICAL convolutions
- Abstract
The goal of cross-view image based geo-localization is to determine the location of a given street-view image by matching it with a collection of geo-tagged aerial images, which has important applications in the fields of remote sensing information utilization and augmented reality. Most current cross-view image based geo-localization methods focus on the image content and ignore the relations between feature nodes, resulting in insufficient mining of effective information. To address this problem, this study proposes feature relation guided cross-view image based geo-localization. This method first processes aerial remote sensing images using a polar transform to achieve the geometric coarse alignment of ground-to-aerial images, and then realizes local contextual feature concern and global feature correlation modeling of the images through the feature relation guided attention generation module designed in this study. Specifically, the module includes two branches of deformable convolution based multiscale contextual feature extraction and global spatial relations mining, which effectively capture global structural information between feature nodes at different locations while correlating contextual features and guiding global feature attention generation. Finally, a novel feature aggregation module, MixVPR, is introduced to aggregate global feature descriptors to accomplish image matching and localization. After experimental validation, the cross-view image based geo-localization algorithm proposed in this study yields results of 92.08%, 97.70%, and 98.66% for the top 1, top 5, and top 10 metrics, respectively, in CVUSA, a popular public cross-view dataset, and exhibits superior performance compared to algorithms of the same type. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
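The polar transform used in entry 14 for coarse geometric alignment resamples the aerial image so that concentric circles around its centre become rows, roughly matching the cylindrical geometry of a ground panorama. A nearest-neighbour sketch; the angular origin, sampling density, and function name are illustrative assumptions.

```python
import math

def polar_transform(aerial, out_h, out_w):
    """Resample a square aerial image (list of rows) into polar
    coordinates: output row = radial distance from the image centre,
    output column = azimuth angle.  Nearest-neighbour sampling, with
    out-of-range samples clamped to the image border."""
    S = len(aerial)
    c = (S - 1) / 2.0
    out = []
    for i in range(out_h):                    # radius: centre -> edge
        r = c * (i + 1) / out_h
        row = []
        for j in range(out_w):                # azimuth: 0 -> 2*pi
            theta = 2 * math.pi * j / out_w
            y = int(round(c - r * math.cos(theta)))
            x = int(round(c + r * math.sin(theta)))
            row.append(aerial[min(max(y, 0), S - 1)][min(max(x, 0), S - 1)])
        out.append(row)
    return out
```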
15. Fire Detection and Geo-Localization Using UAV's Aerial Images and Yolo-Based Models.
- Author
- Choutri, Kheireddine, Lagha, Mohand, Meshoul, Souham, Batouche, Mohamed, Bouzidi, Farah, and Charef, Wided
- Subjects
- FIRE detectors, FALSE alarms, ENVIRONMENTAL sciences, DETECTION alarms, CLIMATE change
- Abstract
The past decade has witnessed a growing demand for drone-based fire detection systems, driven by escalating concerns about wildfires exacerbated by climate change, as corroborated by environmental studies. However, deploying existing drone-based fire detection systems in real-world operational conditions poses practical challenges, notably the intricate and unstructured environments and the dynamic nature of UAV-mounted cameras, often leading to false alarms and inaccurate detections. In this paper, we describe a two-stage framework for fire detection and geo-localization. The key features of the proposed work included the compilation of a large dataset from several sources to capture various visual contexts related to fire scenes. The bounding boxes of the regions of interest were labeled using three target levels, namely fire, non-fire, and smoke. The second feature was the investigation of YOLO models to undertake the detection and localization tasks. YOLO-NAS was retained as the best performing model using the compiled dataset with an average mAP50 of 0.71 and an F1_score of 0.68. Additionally, a fire localization scheme based on stereo vision was introduced, and the hardware implementation was executed on a drone equipped with a Pixhawk microcontroller. The test results were very promising and showed the ability of the proposed approach to contribute to a comprehensive and effective fire detection system. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
16. A Novel Geo-Localization Method for UAV and Satellite Images Using Cross-View Consistent Attention.
- Author
- Cui, Zhuofan, Zhou, Pengwei, Wang, Xiaolong, Zhang, Zilun, Li, Yingxuan, Li, Hongbo, and Zhang, Yu
- Subjects
- REMOTE-sensing images, DRONE aircraft, AERONAUTICAL navigation, DATA augmentation, IMAGE retrieval, FEATURE extraction, DRUM set
- Abstract
Geo-localization has been widely applied as an important technique to get the longitude and latitude for unmanned aerial vehicle (UAV) navigation in outdoor flight. Due to the possible interference and blocking of GPS signals, the method based on image retrieval, which is less likely to be interfered with, has received extensive attention in recent years. The geo-localization of UAVs and satellites can be achieved by querying pre-obtained satellite images with GPS-tagged and drone images from different perspectives. In this paper, an image transformation technique is used to extract cross-view geo-localization information from UAVs and satellites. A single-stage training method in UAV and satellite geo-localization is first proposed, which simultaneously realizes cross-view feature extraction and image retrieval, and achieves higher accuracy than existing multi-stage training techniques. A novel piecewise soft-margin triplet loss function is designed to avoid model parameters being trapped in suboptimal sets caused by the lack of constraint on positive and negative samples. The results illustrate that the proposed loss function enhances image retrieval accuracy and realizes a better convergence. Moreover, a data augmentation method for satellite images is proposed to overcome the disproportionate numbers of image samples. On the benchmark University-1652, the proposed method achieves the state-of-the-art result with a 6.67% improvement in recall rate (R@1) and 6.13% in average precision (AP). All codes will be publicized to promote reproducibility. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
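Entry 16's piecewise soft-margin triplet loss builds on the standard soft-margin triplet loss, which replaces the hard margin with a smooth log-sigmoid penalty. The sketch below shows that standard form plus an illustrative piecewise clamp; the clamp bounds and the overall piecewise structure are assumptions, since the abstract does not give the exact formula.

```python
import math

def soft_margin_triplet(d_pos, d_neg):
    """Standard soft-margin triplet loss: pushes the anchor-positive
    distance d_pos below the anchor-negative distance d_neg without a
    hard margin hyper-parameter."""
    return math.log(1.0 + math.exp(d_pos - d_neg))

def piecewise_soft_margin_triplet(d_pos, d_neg, lo=-5.0, hi=5.0):
    """Illustrative piecewise variant (an assumption, not the paper's
    exact formula): clamping the logit keeps the loss and its gradient
    bounded for very easy or very hard triplets, one way to avoid the
    suboptimal parameter sets the abstract describes."""
    z = min(max(d_pos - d_neg, lo), hi)
    return math.log(1.0 + math.exp(z))
```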
17. A new geographic positioning method based on horizon image retrieval
- Author
- Lan, Gonghao, Tang, Jin, and Guo, Fan
- Published
- 2024
- Full Text
- View/download PDF
18. A Cross-View Geo-Localization Algorithm Using UAV Image and Satellite Image
- Author
- Jiqi Fan, Enhui Zheng, Yufei He, and Jianxing Yang
- Subjects
- geo-localization, UAV, satellite, transformer, style transfer, Chemical technology, TP1-1185
- Abstract
Within research on the cross-view geolocation of UAVs, differences in image sources and interference from similar scenes pose huge challenges. Inspired by multimodal machine learning, in this paper, we design a single-stream pyramid transformer network (SSPT). The backbone of the model uses the self-attention mechanism to enrich its own internal features in the early stage and uses the cross-attention mechanism in the later stage to refine and interact with different features to eliminate irrelevant interference. In addition, in the post-processing part of the model, a header module is designed for upsampling to generate heat maps, and a Gaussian weight window is designed to assign label weights to make the model converge better. Together, these methods improve the positioning accuracy of UAV images in satellite images. Finally, we also use style transfer technology to simulate various environmental changes in order to expand the experimental data, further proving the environmental adaptability and robustness of the method. The final experimental results show that our method yields significant performance improvement: The relative distance score (RDS) of the SSPT-384 model on the benchmark UL14 dataset is significantly improved from 76.25% to 84.40%, while the meter-level accuracy (MA) of 3 m, 5 m, and 20 m is increased by 12%, 12%, and 10%, respectively. For the SSPT-256 model, the RDS has been increased to 82.21%, and the meter-level accuracy (MA) of 3 m, 5 m, and 20 m has increased by 5%, 5%, and 7%, respectively. It still shows strong robustness on the extended thermal infrared (TIR), nighttime, and rainy day datasets.
- Published
- 2024
- Full Text
- View/download PDF
19. MTGL40-5: A Multi-Temporal Dataset for Remote Sensing Image Geo-Localization.
- Author
- Ma, Jingjing, Pei, Shiji, Yang, Yuqun, Tang, Xu, and Zhang, Xiangrong
- Subjects
- REMOTE-sensing images, IMAGE databases, IMAGE registration, REMOTE sensing, PROBLEM solving
- Abstract
Image-based geo-localization focuses on predicting the geographic information of query images by matching them with annotated images in a database. To facilitate relevant studies, researchers collect numerous images to build the datasets, which explore many challenges faced in real-world geo-localization applications, significantly improving their practicability. However, a crucial challenge that often arises is overlooked, named the cross-time challenge in this paper, i.e., if query and database images are taken from the same landmark but at different time periods, the significant difference in their image content caused by the time gap will notably increase the difficulty of image matching, consequently reducing geo-localization accuracy. The cross-time challenge has a greater negative influence on non-real-time geo-localization applications, particularly those involving a long time span between query and database images, such as satellite-view geo-localization. Furthermore, the rough geographic information (e.g., names) instead of precise coordinates provided by most existing datasets limits the geo-localization accuracy. Therefore, to solve these problems, we propose a dataset, MTGL40-5, which contains remote sensing (RS) satellite images captured from 40 large-scale geographic locations spanning five different years. These large-scale images are split to create query images and a database with landmark labels for geo-localization. By observing images from the same landmark but at different time periods, the cross-time challenge becomes more evident. Thus, MTGL40-5 supports researchers in tackling this challenge and further improving the practicability of geo-localization. Moreover, it provides additional geographic coordinate information, enabling the study of high-accuracy geo-localization. 
Based on the proposed MTGL40-5 dataset, many existing geo-localization methods, including state-of-the-art approaches, struggle to produce satisfactory results when facing the cross-time challenge. This highlights the importance of proposing MTGL40-5 to address the limitations of current methods in effectively solving the cross-time challenge. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
20. Learning Visual Representation Clusters for Cross-View Geo-Location.
- Author
- Song, Haoshuai, Wang, Zhen, Lei, Yi, Shi, Dianxi, Tong, Xiaochong, Lei, Yaxian, and Qiu, Chunping
- Abstract
Cross-view geo-location is a crucial research field that determines the geographic location from images taken from different viewpoints. It is often studied as a retrieval task, where the query images are with unknown locations, and the database includes images with geo-tags from a different platform. Learning image representations by neural networks is an important step, and one typical training method is using a classification loss, where cross-view images of the same locations are considered the same category. However, existing methods only focus on pushing the representation distances of different categories while ignoring the intracategory representation distances of samples from different platforms. Considering that controlling the intracategory distance can help to guide the model to extract compact category-sharing representations from cross-view images, we propose a categorized cluster loss to learn separate and compact representation clusters. Categorized cluster loss can supervise the network to learn invariant information from samples of different platforms by constraining both the intercategory and intracategory feature distances. Meanwhile, we design a category-view-stratified sampling strategy, which samples balanced inputs in terms of both category and view in each batch during the learning process. We implemented our approach with a lightweight OSNet-based network and achieved higher accuracy with fewer parameters on a typical and challenging cross-view geo-location dataset than most state-of-the-art (SOTA) methods. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
21. Predictive Information Preservation via Variational Information Bottleneck for Cross-View Geo-Localization
- Author
-
Li, Wansi, Hu, Qian, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Yang, Shuo, editor, and Lu, Huimin, editor
- Published
- 2022
- Full Text
- View/download PDF
22. A news picture geo-localization pipeline based on deep learning and street view images
- Author
-
Tianyou Chu, Yumin Chen, Heng Su, Zhenzhen Xu, Guodong Chen, and Annan Zhou
- Subjects
street view images ,geo-localization ,image retrieval ,social media ,Mathematical geography. Cartography ,GA1-1776 - Abstract
Numerous news and event pictures are taken and shared on the internet every day, containing abundant information worth mining, but only a small fraction of them are geotagged. The visual content of a news image hints at its geographical location because such images are usually taken at the site of the incident, which provides a prerequisite for geo-localization. This paper proposes an automated, deep-learning-based pipeline for geo-localizing news pictures in a large-scale urban environment using geotagged street view images as a reference dataset. The approach obtains location information by constructing an attention-based feature extraction network. The image features are then aggregated, and candidate street view images are retrieved with a selective matching kernel function. Finally, the coordinates of the news images are estimated by a kernel density prediction method. The pipeline is tested on news pictures from Hong Kong. In the comparison experiments, the proposed pipeline shows stable performance and generalizability in the large-scale urban environment. In addition, the performance analysis of the pipeline's components shows its ability to recognize localization features in partial areas of pictures and the effectiveness of the proposed solution for news picture geo-localization.
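The final kernel-density step can be illustrated with a small NumPy sketch that takes retrieved candidate coordinates and returns the density mode; the bandwidth, grid size, and function name are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def kde_location(candidates, bandwidth=0.01, grid=100):
    """Estimate a query coordinate as the mode of a Gaussian kernel
    density over retrieved, geotagged street-view candidates."""
    pts = np.asarray(candidates, dtype=float)          # (N, 2) lon/lat pairs
    lo, hi = pts.min(0) - 3 * bandwidth, pts.max(0) + 3 * bandwidth
    xs = np.linspace(lo[0], hi[0], grid)
    ys = np.linspace(lo[1], hi[1], grid)
    X, Y = np.meshgrid(xs, ys)
    density = np.zeros_like(X)
    for px, py in pts:                                  # sum isotropic kernels
        density += np.exp(-((X - px) ** 2 + (Y - py) ** 2) / (2 * bandwidth ** 2))
    i = np.unravel_index(density.argmax(), density.shape)
    return X[i], Y[i]
```

The grid point of highest density wins, so a tight cluster of correct retrievals outvotes isolated mismatches.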
- Published
- 2022
- Full Text
- View/download PDF
23. IML-Net: A Framework for Cross-View Geo-Localization with Multi-Domain Remote Sensing Data
- Author
-
Yiming Yan, Mengyuan Wang, Nan Su, Wei Hou, Chunhui Zhao, and Wenxuan Wang
- Subjects
geo-localization ,multi-domain ,IML-Net ,ZHcity750 ,Science - Abstract
Cross-view geolocation is a valuable yet challenging task. In practical applications, the images targeted by cross-view geolocation technology encompass multi-domain remote sensing images, including those from different platforms (e.g., drone cameras and satellites), different perspectives (e.g., nadir and oblique), and different temporal conditions (e.g., various seasons and weather conditions). Based on the characteristics of these images, we have designed an effective framework, Image Reconstruction and Multi-Unit Mutual Learning Net (IML-Net), for accomplishing cross-view geolocation tasks. By incorporating a deconvolutional network into the architecture to reconstruct images, we can better bridge the differences in remote sensing image features across different domains. This enables the mapping of target images from different platforms and perspectives into a shared latent space representation, obtaining more discriminative feature descriptors. The process enhances the robustness of feature extraction for locating targets across a wide range of perspectives. To improve the network’s performance, we introduce attention regions learned from different units as augmented data during the training process. For the current cross-view geolocation datasets, the use of large-scale datasets is limited due to high costs and privacy concerns, leading to the prevalent use of simulated data. However, real data allow the network to learn more generalizable features. To make the model more robust and stable, we collected two groups of multi-domain datasets from the Zurich and Harbin regions, incorporating real data into the cross-view geolocation task to construct the ZHcity750 Dataset. Our framework is evaluated on the cross-domain ZHcity750 Dataset, which shows competitive results compared to state-of-the-art methods.
- Published
- 2024
- Full Text
- View/download PDF
24. WAMF-FPI: A Weight-Adaptive Multi-Feature Fusion Network for UAV Localization.
- Author
-
Wang, Guirong, Chen, Jiahao, Dai, Ming, and Zheng, Enhui
- Subjects
- *
LOCALIZATION (Mathematics) , *REMOTE-sensing images , *DRONE aircraft , *MULTISCALE modeling , *SATELLITE positioning , *PROBLEM solving , *DEEP learning - Abstract
UAV localization in GNSS-denied environments is a hot research topic in the field of cross-view geo-localization. Previous methods tried to find the corresponding position directly in the satellite image from the UAV image, but they lacked consideration of spatial and multi-scale information. Building on the method of finding points with an image, we propose a novel architecture, the Weight-Adaptive Multi-Feature fusion network for UAV localization (WAMF-FPI). We treat this positioning as a low-level task and achieve more accurate localization by restoring the feature map to the resolution of the original satellite image. Then, to enhance the model's ability to handle multi-scale problems, we propose a Weight-Adaptive Multi-Feature fusion module (WAMF), which introduces a weighting mechanism to fuse different features. Finally, because existing methods treat all positive samples in the same way, which is disadvantageous for accurate localization, we introduce a Hanning loss that lets the model pay more attention to the central area of the target. Our model achieves competitive results on the UL14 dataset. Using RDS as the evaluation metric, performance improves from 57.22 to 65.33 compared to Finding Point with Image (FPI). In addition, we compute actual distance errors (in meters) to evaluate the model, and localization accuracy at the 20 m level improves from 57.67% to 69.73%, demonstrating the model's strong performance. Although the model performs well, much remains to be done before it can be deployed. [ABSTRACT FROM AUTHOR]
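The idea behind the Hanning loss, weighting positive pixels by a Hanning window centred on the target so that the centre dominates, can be sketched as follows (the weight-map form and parameter names are illustrative, not the paper's exact loss):

```python
import numpy as np

def hanning_weight_map(h, w, cy, cx, radius):
    """2D Hanning-style weight map: 1 at the target centre (cy, cx),
    tapering smoothly to 0 at `radius`, and 0 outside it."""
    ys, xs = np.mgrid[0:h, 0:w]
    d = np.sqrt((ys - cy) ** 2 + (xs - cx) ** 2)
    return np.where(d < radius, 0.5 * (1 + np.cos(np.pi * d / radius)), 0.0)
```

Multiplying a per-pixel localization loss by this map makes errors near the target centre count most, matching the abstract's motivation of not treating all positive samples equally.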
- Published
- 2023
- Full Text
- View/download PDF
25. UAV's Status Is Worth Considering: A Fusion Representations Matching Method for Geo-Localization.
- Author
-
Zhu, Runzhe, Yang, Mingze, Yin, Ling, Wu, Fei, and Yang, Yuncheng
- Subjects
- *
DRONE aircraft , *AERONAUTICAL navigation , *INFORMATION networks , *IMAGE registration , *DEEP learning - Abstract
Visual geo-localization plays a crucial role in positioning and navigation for unmanned aerial vehicles, whose goal is to match the same geographic target from different views. This is a challenging task due to the drastic variations in viewpoint and appearance. Previous methods have focused on mining features inside the images, but they underestimated the influence of external elements and the interaction of various representations. Inspired by multimodal learning and bilinear pooling, we propose a pioneering feature fusion network (MBF) to address these inherent differences between drone and satellite views. We observe that the UAV's status, such as flight height, changes the size of the image's field of view, and that local parts of the target scene play an important role in extracting discriminative features. We therefore present two approaches to exploit these priors. The first adds status information to the network by transforming it into word embeddings, which are concatenated with the image embeddings in the Transformer block to learn status-aware features. The second correlates and reinforces global and local part feature maps from the same viewpoint through hierarchical bilinear pooling (HBP) to improve the robustness of the feature representation. Through these approaches, we achieve more discriminative deep representations that facilitate geo-localization more effectively. Our experiments on existing benchmark datasets show significant performance gains, reaching new state-of-the-art results. Remarkably, recall@1 reaches 89.05% in the drone localization task and 93.15% in the drone navigation task on University-1652, and the model shows strong robustness at different flight heights on the SUES-200 dataset. [ABSTRACT FROM AUTHOR]
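How status information can be injected as an extra token alongside image embeddings can be sketched as below; the height bucketing, the random embedding table, and all dimensions are illustrative assumptions, not MBF's actual design:

```python
import numpy as np

def add_status_token(patch_tokens, height_m, dim=64,
                     buckets=(50, 100, 150, 200, 250, 300)):
    """Discretise flight height into a bucket, look it up in a small
    (hypothetical) embedding table, and prepend the resulting status
    token to the image patch tokens."""
    rng = np.random.default_rng(0)
    table = rng.standard_normal((len(buckets) + 1, dim))  # toy embedding table
    idx = int(np.searchsorted(buckets, height_m))         # height -> bucket id
    status = table[idx][None, :]                          # (1, dim) status token
    return np.concatenate([status, patch_tokens], axis=0)
```

In a real Transformer the concatenated sequence would then pass through attention blocks, letting image tokens condition on the flight status.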
- Published
- 2023
- Full Text
- View/download PDF
26. Feature Relation Guided Cross-View Image Based Geo-Localization
- Author
-
Qingfeng Hou, Jun Lu, Haitao Guo, Xiangyun Liu, Zhihui Gong, Kun Zhu, and Yifan Ping
- Subjects
cross-view ,geo-localization ,relation guided ,deformable convolution ,multiscale contextual information ,global spatial relations mining ,Science - Abstract
The goal of cross-view image based geo-localization is to determine the location of a given street-view image by matching it with a collection of geo-tagged aerial images, which has important applications in remote sensing information utilization and augmented reality. Most current cross-view image based geo-localization methods focus on image content and ignore the relations between feature nodes, resulting in insufficient mining of effective information. To address this problem, this study proposes feature relation guided cross-view image based geo-localization. The method first applies a polar transform to aerial remote sensing images to achieve coarse geometric alignment of ground-to-aerial images, and then realizes attention to local contextual features and global feature correlation modeling through the feature relation guided attention generation module designed in this study. Specifically, the module includes two branches, deformable convolution based multiscale contextual feature extraction and global spatial relations mining, which effectively capture global structural information between feature nodes at different locations while correlating contextual features and guiding global attention generation. Finally, a novel feature aggregation module, MixVPR, is introduced to aggregate global feature descriptors for image matching and localization. In experiments on CVUSA, a popular public cross-view dataset, the proposed algorithm achieves 92.08%, 97.70%, and 98.66% for the top 1, top 5, and top 10 metrics, respectively, outperforming algorithms of the same type.
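A common formulation of the polar transform used for ground-to-aerial coarse alignment maps rings around the aerial image centre to rows of a panorama-shaped output; this nearest-neighbour sketch follows that standard recipe and may differ in details (row orientation, interpolation) from the paper's version:

```python
import numpy as np

def polar_transform(aerial, out_h, out_w):
    """Nearest-neighbour polar transform of a square aerial image:
    rows index radius (top row = outer edge, bottom row = centre),
    columns index azimuth, approximating ground-panorama geometry."""
    s = aerial.shape[0]
    c = (s - 1) / 2.0                                     # image centre
    rows = np.arange(out_h)[:, None] / out_h              # normalised radius
    cols = np.arange(out_w)[None, :] / out_w * 2 * np.pi  # azimuth angle
    r = (1.0 - rows) * c                                  # outer edge at row 0
    y = np.clip((c + r * np.cos(cols)).astype(int), 0, s - 1)
    x = np.clip((c + r * np.sin(cols)).astype(int), 0, s - 1)
    return aerial[y, x]
```

After this warp, vertical structure in the aerial view roughly lines up with vertical structure in the street-view panorama, which is what makes the subsequent matching easier.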
- Published
- 2023
- Full Text
- View/download PDF
27. Fire Detection and Geo-Localization Using UAV’s Aerial Images and Yolo-Based Models
- Author
-
Kheireddine Choutri, Mohand Lagha, Souham Meshoul, Mohamed Batouche, Farah Bouzidi, and Wided Charef
- Subjects
UAV ,deep learning ,stereo vision ,YOLO models ,Pixhawk ,geo-localization ,Technology ,Engineering (General). Civil engineering (General) ,TA1-2040 ,Biology (General) ,QH301-705.5 ,Physics ,QC1-999 ,Chemistry ,QD1-999 - Abstract
The past decade has witnessed a growing demand for drone-based fire detection systems, driven by escalating concerns about wildfires exacerbated by climate change, as corroborated by environmental studies. However, deploying existing drone-based fire detection systems in real-world operational conditions poses practical challenges, notably the intricate and unstructured environments and the dynamic nature of UAV-mounted cameras, often leading to false alarms and inaccurate detections. In this paper, we describe a two-stage framework for fire detection and geo-localization. The key features of the proposed work include the compilation of a large dataset from several sources to capture various visual contexts related to fire scenes, with the bounding boxes of the regions of interest labeled with three target classes: fire, non-fire, and smoke. The second feature is the investigation of YOLO models for the detection and localization tasks. YOLO-NAS was retained as the best-performing model on the compiled dataset, with an average mAP50 of 0.71 and an F1 score of 0.68. Additionally, a fire localization scheme based on stereo vision was introduced, and the hardware implementation was executed on a drone equipped with a Pixhawk microcontroller. The test results were very promising and showed the ability of the proposed approach to contribute to a comprehensive and effective fire detection system.
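The stereo-vision localization step rests on the classic pinhole relation Z = f·B/d, which recovers depth from focal length, baseline, and disparity; this is a textbook sketch, not the paper's code:

```python
def stereo_depth(focal_px, baseline_m, disparity_px):
    """Depth of a matched point from a rectified stereo pair:
    Z = f * B / d, with focal length f in pixels, baseline B in
    metres, and disparity d in pixels."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px
```

For example, an 800 px focal length, 0.2 m baseline, and 4 px disparity would place the detected fire about 40 m from the camera rig; combined with the drone's pose, that depth yields the fire's geo-location.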
- Published
- 2023
- Full Text
- View/download PDF
28. Applying Augmented Reality to Learn Basic Concepts of Programming in U-Learning Environment
- Author
-
Acosta, Denis, Álvarez, Margarita, Durán, Elena, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Pesado, Patricia, editor, and Eterovic, Jorge, editor
- Published
- 2021
- Full Text
- View/download PDF
29. GAN-Based Satellite Imaging: A Survey on Techniques and Applications
- Author
-
Hadi Mansourifar, Alexander Moskovitz, Ben Klingensmith, Dino Mintas, and Steven J. Simske
- Subjects
Generative adversarial network ,geo-localization ,image to image translation ,road extraction ,satellite imaging ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
Satellite image analysis is widely used in many real-time applications, from agriculture to the military. Due to the wide range of Generative Adversarial Network (GAN) applications in multiple areas of satellite imaging, a comprehensive review is required in this area. This paper takes the first step in this direction by categorizing the GAN-based satellite imaging research using seven considerations. We discuss not only the challenges but also future research trends and directions. Among the major findings, we have observed increasing componentization and modularization of GANs to be used as elements of larger systems. In addition to the GAN types used exclusively in each application, we demonstrate the deep neural network architectures used as the generator structure. Eventually, we summarize the results and evaluate the significant impact of GANs on improving performance compared to traditional approaches.
- Published
- 2022
- Full Text
- View/download PDF
30. A Semantic Guidance and Transformer-Based Matching Method for UAVs and Satellite Images for UAV Geo-Localization
- Author
-
Jiedong Zhuang, Xuruoyan Chen, Ming Dai, Wenbo Lan, Yongheng Cai, and Enhui Zheng
- Subjects
Cross-view image matching ,geo-localization ,UAV image localization ,deep neural network ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
It is a challenging task for unmanned aerial vehicles (UAVs) without a positioning system to locate targets using images. Matching drone and satellite images is one of the key steps in this task. Due to the large angle and scale gap between drone and satellite views, it is very important to extract fine-grained features with strong characterization ability. Most published methods are based on CNN structures, but such methods lose a great deal of information because of the limitations of the convolution operation (e.g., limited receptive fields and downsampling operations). To make up for this shortcoming, a transformer-based network is proposed to extract more contextual information. The network promotes feature alignment through a semantic guidance module (SGM), which aligns the same semantic parts in the two images by classifying each pixel based on pixel-wise attention. In addition, this method can be easily combined with existing methods. The proposed method has been evaluated on the newest UAV-based geo-localization dataset. Compared with the existing state-of-the-art (SOTA) method, it achieves almost an 8% improvement in accuracy.
- Published
- 2022
- Full Text
- View/download PDF
31. Geo-Localization Based on Dynamically Weighted Factor-Graph
- Author
-
Universidad de Alicante. Departamento de Física, Ingeniería de Sistemas y Teoría de la Señal, Universidad de Alicante. Instituto Universitario de Investigación Informática, Muñoz-Bañón, Miguel Á., Olivas, Alejandro, Velasco, Edison P., Candelas-Herías, Francisco A., Torres, Fernando, Universidad de Alicante. Departamento de Física, Ingeniería de Sistemas y Teoría de la Señal, Universidad de Alicante. Instituto Universitario de Investigación Informática, Muñoz-Bañón, Miguel Á., Olivas, Alejandro, Velasco, Edison P., Candelas-Herías, Francisco A., and Torres, Fernando
- Abstract
Feature-based geo-localization relies on associating features extracted from aerial imagery with those detected by the vehicle's sensors, which requires that the landmark types be observable from both sources. The resulting lack of variety in feature types produces poor representations, leading to outliers (caused by ambiguities) and deviations (caused by missed detections). To mitigate these drawbacks, in this letter we present a dynamically weighted factor graph model for estimating the vehicle's trajectory. The weight adjustment in this implementation depends on quantifying the information in the detections performed with a LiDAR sensor, and a prior (GNSS-based) error estimation is included in the model. When the representation becomes ambiguous or sparse, the weights are dynamically adjusted to rely on the corrected prior trajectory, thereby mitigating outliers and deviations. We compare our method against state-of-the-art geo-localization methods in a challenging and ambiguous environment in which we also induce detection losses, and we demonstrate mitigation of the mentioned drawbacks where the other methods fail.
- Published
- 2024
32. Digital Scalability and Growth Options
- Author
-
Moro Visconti, Roberto and Moro Visconti, Roberto
- Published
- 2020
- Full Text
- View/download PDF
33. A Novel Geo-Localization Method for UAV and Satellite Images Using Cross-View Consistent Attention
- Author
-
Zhuofan Cui, Pengwei Zhou, Xiaolong Wang, Zilun Zhang, Yingxuan Li, Hongbo Li, and Yu Zhang
- Subjects
geo-localization ,UAV ,satellite ,transformer ,cross-view ,Science - Abstract
Geo-localization is widely applied as an important technique for obtaining the longitude and latitude for unmanned aerial vehicle (UAV) navigation in outdoor flight. Because GPS signals can be interfered with or blocked, methods based on image retrieval, which are less susceptible to interference, have received extensive attention in recent years. Geo-localization between UAVs and satellites can be achieved by querying a database of pre-obtained, GPS-tagged satellite images with drone images taken from different perspectives. In this paper, an image transformation technique is used to extract cross-view geo-localization information from UAV and satellite images. A single-stage training method for UAV and satellite geo-localization is first proposed, which simultaneously realizes cross-view feature extraction and image retrieval and achieves higher accuracy than existing multi-stage training techniques. A novel piecewise soft-margin triplet loss function is designed to prevent model parameters from being trapped in suboptimal sets caused by the lack of constraints on positive and negative samples. The results show that the proposed loss function enhances image retrieval accuracy and achieves better convergence. Moreover, a data augmentation method for satellite images is proposed to overcome the disproportionate numbers of image samples. On the University-1652 benchmark, the proposed method achieves state-of-the-art results, with a 6.67% improvement in recall rate (R@1) and 6.13% in average precision (AP). All code will be released to promote reproducibility.
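The standard soft-margin triplet loss that the paper's piecewise variant builds on is log(1 + e^{α(d_pos − d_neg)}); the sketch below shows only that base form (the scale α and the paper's piecewise constraints on positive and negative samples are assumptions or omissions here):

```python
import numpy as np

def soft_margin_triplet(anchor, positive, negative, alpha=10.0):
    """Soft-margin triplet loss log(1 + exp(alpha * (d_pos - d_neg))):
    near zero when the positive is much closer to the anchor than the
    negative, large when the ordering is violated."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return np.log1p(np.exp(alpha * (d_pos - d_neg)))
```

Unlike a hard-margin hinge, this loss stays smooth everywhere, which tends to help convergence in retrieval training.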
- Published
- 2023
- Full Text
- View/download PDF
34. MTGL40-5: A Multi-Temporal Dataset for Remote Sensing Image Geo-Localization
- Author
-
Jingjing Ma, Shiji Pei, Yuqun Yang, Xu Tang, and Xiangrong Zhang
- Subjects
geo-localization ,remote sensing satellite images ,geographic coordinate information ,Science - Abstract
Image-based geo-localization focuses on predicting the geographic information of query images by matching them with annotated images in a database. To facilitate relevant studies, researchers collect numerous images to build the datasets, which explore many challenges faced in real-world geo-localization applications, significantly improving their practicability. However, a crucial challenge that often arises is overlooked, named the cross-time challenge in this paper, i.e., if query and database images are taken from the same landmark but at different time periods, the significant difference in their image content caused by the time gap will notably increase the difficulty of image matching, consequently reducing geo-localization accuracy. The cross-time challenge has a greater negative influence on non-real-time geo-localization applications, particularly those involving a long time span between query and database images, such as satellite-view geo-localization. Furthermore, the rough geographic information (e.g., names) instead of precise coordinates provided by most existing datasets limits the geo-localization accuracy. Therefore, to solve these problems, we propose a dataset, MTGL40-5, which contains remote sensing (RS) satellite images captured from 40 large-scale geographic locations spanning five different years. These large-scale images are split to create query images and a database with landmark labels for geo-localization. By observing images from the same landmark but at different time periods, the cross-time challenge becomes more evident. Thus, MTGL40-5 supports researchers in tackling this challenge and further improving the practicability of geo-localization. Moreover, it provides additional geographic coordinate information, enabling the study of high-accuracy geo-localization. 
Based on the proposed MTGL40-5 dataset, many existing geo-localization methods, including state-of-the-art approaches, struggle to produce satisfactory results when facing the cross-time challenge. This highlights the importance of proposing MTGL40-5 to address the limitations of current methods in effectively solving the cross-time challenge.
- Published
- 2023
- Full Text
- View/download PDF
35. Dual attention and dual fusion: An accurate way of image-based geo-localization.
- Author
-
Yuan, Yuan, Sun, Bo, and Liu, Ganchao
- Subjects
- *
LOCALIZATION (Mathematics) , *ARTIFICIAL neural networks , *CONVOLUTIONAL neural networks , *REMOTE-sensing images , *DRONE aircraft , *FEATURE extraction - Abstract
When the GPS signal is jammed or lost, visual geo-localization becomes particularly important for Unmanned Aerial Vehicles (UAVs). Since matching UAV images with satellite maps is a multi-source, multi-view problem, visual geo-localization is very challenging. Most existing methods use a Convolutional Neural Network (CNN) and predict the similarity between UAV images and satellite maps from the final output of the backbone network. Due to continuously stacked convolution and pooling, rich local information is gradually lost as semantic information is acquired. To solve this problem, a dual attention and dual fusion (DADF) scene matching algorithm is proposed. The contributions of this paper are as follows: 1) To achieve accurate matching between UAV and satellite images, a visual geo-localization algorithm based on a siamese network is designed. 2) To improve semantic feature extraction, a dual-attention model is constructed, so that the network pays more attention to the parts that are useful for the similarity metric. 3) A dual fusion model is established; through the feature fusion method and the multi-level matching result fusion algorithm, the confidence of matching is improved. To verify the performance of the proposed approach, the LA850 and NWPU-ChangAn datasets were collected and augmented. The experimental results show that the proposed algorithm is more effective than the comparison algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
36. A Transformer-Based Feature Segmentation and Region Alignment Method for UAV-View Geo-Localization.
- Author
-
Dai, Ming, Hu, Jianhong, Zhuang, Jiedong, and Zheng, Enhui
- Subjects
- *
REMOTE-sensing images , *FEATURE extraction , *DRONE aircraft , *TASK analysis , *IMAGE retrieval - Abstract
Cross-view geo-localization is the task of matching images of the same geographic target from different views, e.g., unmanned aerial vehicle (UAV) and satellite. The most difficult challenges are position shift and uncertainty in distance and scale. Existing methods mainly aim at digging for more comprehensive fine-grained information but underestimate the importance of extracting robust feature representations and the impact of feature alignment. CNN-based methods have achieved great success in cross-view geo-localization, but they still have limitations: convolution captures only part of the information in a neighborhood, and scale-reduction operations lose fine-grained information. We introduce a simple and efficient transformer-based structure called Feature Segmentation and Region Alignment (FSRA) to enhance the model's ability to understand contextual information and the distribution of instances. Without using additional supervisory information, FSRA divides regions based on the heat distribution of the transformer's feature map and then aligns multiple specific regions across views one-to-one. Finally, FSRA integrates each region into a set of feature representations. Crucially, FSRA does not divide regions manually but automatically, based on the heat distribution of the feature map, so specific instances can still be divided and aligned under significant shifts and scale changes in the image. In addition, a multiple sampling strategy is proposed to overcome the disparity between the number of satellite images and the number of images from other sources. Experiments show that the proposed method has superior performance and achieves the state of the art in both drone-view target localization and drone navigation. [ABSTRACT FROM AUTHOR]
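FSRA's automatic, heat-based region division can be approximated in a few lines: rank patch tokens by their mean activation ("heat") and split the ranking into equal chunks; the pooling and chunk count are illustrative assumptions, and the paper's cross-view alignment step is omitted:

```python
import numpy as np

def heat_regions(tokens, n_regions=3):
    """Split transformer patch tokens into regions by heat rank:
    compute each token's mean activation, sort tokens hottest-first,
    chunk the ranking, and mean-pool one descriptor per region."""
    heat = tokens.mean(axis=1)                 # per-token "heat"
    order = np.argsort(-heat)                  # hottest tokens first
    chunks = np.array_split(order, n_regions)  # equal-size rank chunks
    return np.stack([tokens[idx].mean(axis=0) for idx in chunks])
```

Because the split depends only on the heat ranking, the same instance lands in the same-rank region even when it shifts or changes scale across views, which is the property the abstract relies on for alignment.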
- Published
- 2022
- Full Text
- View/download PDF
37. UAV-Satellite View Synthesis for Cross-View Geo-Localization.
- Author
-
Tian, Xiaoyang, Shao, Jie, Ouyang, Deqiang, and Shen, Heng Tao
- Subjects
- *
REMOTE-sensing images , *DRONE aircraft , *IMAGE registration - Abstract
The goal of cross-view image matching for geo-localization is to determine the location of a given ground-view image (front view) by matching it with a group of geo-tagged satellite-view images (vertical view). The rapid development of unmanned aerial vehicle (UAV) technology in recent years has provided a real viewpoint close to 45 degrees (oblique view) to bridge the visual gap between views. However, existing methods ignore the direct geometric correspondence between UAV and satellite views and rely on brute-force feature matching, leading to inferior performance. In this context, we propose an end-to-end cross-view matching method that integrates a cross-view synthesis module and a geo-localization module, fully considering the spatial correspondence of UAV-satellite views and the surrounding area information. Specifically, the cross-view synthesis module consists of two parts: the oblique UAV view is first converted to the vertical view by a perspective projection transformation (PPT), which brings the UAV image closer to the satellite image; then conditional generative adversarial nets (CGAN) synthesize a UAV image in the vertical-view style, close to the real satellite image, by learning from the converted UAV image as input and the real satellite image as label. The geo-localization module builds on the existing local pattern network (LPN), which explicitly considers the surrounding environment of the target building. These modules are integrated into a single architecture, called PCL, in which they mutually reinforce each other. Our method outperforms existing UAV-satellite cross-view methods by about 5%. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
38. Joint Representation Learning and Keypoint Detection for Cross-View Geo-Localization.
- Author
-
Lin, Jinliang, Zheng, Zhedong, Zhong, Zhun, Luo, Zhiming, Li, Shaozi, Yang, Yi, and Sebe, Nicu
- Subjects
- *
IMAGE registration , *TEACHING aids , *LEARNING - Abstract
In this paper, we study the cross-view geo-localization problem to match images from different viewpoints. The key motivation underpinning this task is to learn a discriminative viewpoint-invariant visual representation. Inspired by the human visual system for mining local patterns, we propose a new framework called RK-Net to jointly learn the discriminative Representation and detect salient Keypoints with a single Network. Specifically, we introduce a Unit Subtraction Attention Module (USAM) that can automatically discover representative keypoints from feature maps and draw attention to the salient regions. USAM contains very few learning parameters but yields significant performance improvement and can be easily plugged into different networks. We demonstrate through extensive experiments that (1) by incorporating USAM, RK-Net facilitates end-to-end joint learning without the prerequisite of extra annotations. Representation learning and keypoint detection are two highly related tasks. Representation learning aids keypoint detection. Keypoint detection, in turn, enriches the model capability against large appearance changes caused by viewpoint variations. (2) USAM is easy to implement and can be integrated with existing methods, further improving the state-of-the-art performance. We achieve competitive geo-localization accuracy on three challenging datasets, i.e., University-1652, CVUSA and CVACT. Our code is available at https://github.com/AggMan96/RK-Net. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
39. Each Part Matters: Local Patterns Facilitate Cross-View Geo-Localization.
- Author
-
Wang, Tingyu, Zheng, Zhedong, Yan, Chenggang, Zhang, Jiyong, Sun, Yaoqi, Zheng, Bolun, and Yang, Yi
- Subjects
- *
SCALABILITY , *DEEP learning , *FEATURE extraction , *TASK analysis , *IMAGE retrieval - Abstract
Cross-view geo-localization aims to spot images of the same geographic target from different platforms, e.g., drone-view cameras and satellites. It is challenging due to the large visual appearance changes caused by extreme viewpoint variations. Existing methods usually concentrate on mining fine-grained features of the geographic target in the image center but underestimate the contextual information in neighboring areas. In this work, we argue that neighboring areas can be leveraged as auxiliary information, enriching discriminative clues for geo-localization. Specifically, we introduce a simple and effective deep neural network, called Local Pattern Network (LPN), to take advantage of contextual information in an end-to-end manner. Without using extra part estimators, LPN adopts a square-ring feature partition strategy, which assigns attention according to the distance to the image center. This eases part matching and enables part-wise representation learning. Owing to the square-ring partition design, the proposed LPN is robust to rotation variations and achieves competitive results on three prevailing benchmarks, i.e., University-1652, CVUSA and CVACT. Besides, we show that the proposed LPN can be easily embedded into other frameworks to further boost performance. [ABSTRACT FROM AUTHOR]
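LPN's square-ring partition can be sketched with a Chebyshev-distance ring index over the feature map; mean pooling per ring is an assumption here (the paper may pool differently):

```python
import numpy as np

def square_ring_parts(feat, n_parts=4):
    """Partition a (C, H, W) feature map into concentric square rings
    around the centre and return one mean-pooled vector per ring."""
    C, H, W = feat.shape
    ys, xs = np.mgrid[0:H, 0:W]
    # Chebyshev distance to the centre gives square (not circular) rings
    d = np.maximum(np.abs(ys - (H - 1) / 2), np.abs(xs - (W - 1) / 2))
    ring = np.minimum((d / (d.max() + 1e-9) * n_parts).astype(int), n_parts - 1)
    return np.stack([feat[:, ring == k].mean(axis=1) for k in range(n_parts)])
```

Because each ring pools every azimuth at a given distance from the centre, an in-plane rotation of the input leaves the ring descriptors largely unchanged, which is the source of the rotation robustness the abstract claims.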
- Published
- 2022
- Full Text
- View/download PDF
40. A news picture geo-localization pipeline based on deep learning and street view images.
- Author
-
Chu, Tianyou, Chen, Yumin, Su, Heng, Xu, Zhenzhen, Chen, Guodong, and Zhou, Annan
- Subjects
- *
DEEP learning , *GEOTAGGING , *FEATURE extraction , *KERNEL functions , *PICTURES - Abstract
Numerous news or event pictures are taken and shared on the internet every day that hold abundant information worth mining, but only a small fraction of them are geotagged. The visual content of a news image hints at its geographical location because such pictures are usually taken at the site of the incident, which provides a prerequisite for geo-localization. This paper proposes an automated pipeline based on deep learning for the geo-localization of news pictures in a large-scale urban environment using geotagged street view images as a reference dataset. The approach obtains location information by constructing an attention-based feature extraction network. Then, the image features are aggregated, and candidate street view images are retrieved by a selective matching kernel function. Finally, the coordinates of the news images are estimated by a kernel density prediction method. The pipeline is tested on news pictures from Hong Kong. In the comparison experiments, the proposed pipeline shows stable performance and generalizability in the large-scale urban environment. In addition, the performance analysis of the pipeline's components shows its ability to recognize localization features in partial areas of pictures and the effectiveness of the proposed solution for news picture geo-localization. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
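The final step of the pipeline above, estimating coordinates from retrieved geotagged street-view matches by kernel density prediction, can be sketched as a Gaussian-kernel vote over candidate locations. The function name, the equirectangular distance approximation and the bandwidth are illustrative assumptions; the paper's exact estimator may differ.

```python
import math

def kde_estimate(candidates, bandwidth_km=0.5):
    """Pick the candidate location with the highest Gaussian-kernel density,
    i.e. the point best supported by nearby retrieved street-view matches.

    candidates: list of (lat, lon, retrieval_score) tuples.
    """
    def dist_km(a, b):
        # equirectangular approximation, adequate at city scale
        kx = 111.32 * math.cos(math.radians((a[0] + b[0]) / 2.0))
        return math.hypot((a[0] - b[0]) * 111.32, (a[1] - b[1]) * kx)

    best, best_density = None, -1.0
    for lat, lon, _ in candidates:
        density = sum(s * math.exp(-0.5 * (dist_km((lat, lon), (la, lo))
                                           / bandwidth_km) ** 2)
                      for la, lo, s in candidates)
        if density > best_density:
            best, best_density = (lat, lon), density
    return best
```

A tight cluster of matches outvotes a single distant outlier, so the estimate lands inside the cluster.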
41. WAMF-FPI: A Weight-Adaptive Multi-Feature Fusion Network for UAV Localization
- Author
-
Guirong Wang, Jiahao Chen, Ming Dai, and Enhui Zheng
- Subjects
UAV localization ,geo-localization ,deep learning ,transformer ,Science - Abstract
UAV localization in GNSS-denied environments is a hot research topic in the field of cross-view geo-localization. Previous methods tried to find the corresponding position in the satellite image directly from the UAV image, but they neglected spatial and multi-scale information. Building on the approach of finding points with an image, we propose a novel architecture: a Weight-Adaptive Multi-Feature fusion network for UAV localization (WAMF-FPI). We treat this positioning as a low-level task and achieve more accurate localization by restoring the feature map to the resolution of the original satellite image. Then, to strengthen the model's ability to handle multi-scale problems, we propose a Weight-Adaptive Multi-Feature fusion module (WAMF), which introduces a weighting mechanism to fuse different features. Finally, since existing methods treat all positive samples identically, which is disadvantageous for accurate localization, we introduce a Hanning loss that lets the model pay more attention to the central area of the target. Our model achieves competitive results on the UL14 dataset. When using RDS as the evaluation metric, performance improves from 57.22 to 65.33 compared to Finding Point with Image (FPI). In addition, we calculate the actual distance errors (in meters) to evaluate model performance; localization accuracy at the 20 m level improves from 57.67% to 69.73%, demonstrating the strong performance of the model. Although the model performs well, much remains to be done before it can be applied in practice.
- Published
- 2023
- Full Text
- View/download PDF
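The Hanning loss mentioned above weights positive samples so that the centre of the target matters most. A minimal sketch of such a centre-peaked weight map, with a raised-cosine (Hanning-style) falloff, is below; the map size, radius handling and function name are assumptions for illustration, not the WAMF-FPI implementation.

```python
import math

def hanning_weight_map(size, cx, cy, radius):
    """2-D Hanning-style weights: highest at the target centre (cx, cy),
    smoothly falling to zero at `radius`, and zero outside it.

    Such a map can multiply a per-pixel positive-sample loss so the model
    focuses on the centre of the target region instead of treating all
    positive locations alike.
    """
    w = [[0.0] * size for _ in range(size)]
    for y in range(size):
        for x in range(size):
            d = math.hypot(x - cx, y - cy)
            if d < radius:
                # raised cosine: 1.0 at d = 0, 0.0 at d = radius
                w[y][x] = 0.5 + 0.5 * math.cos(math.pi * d / radius)
    return w
```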
42. UAV’s Status Is Worth Considering: A Fusion Representations Matching Method for Geo-Localization
- Author
-
Runzhe Zhu, Mingze Yang, Ling Yin, Fei Wu, and Yuncheng Yang
- Subjects
cross-view image matching ,geo-localization ,UAV image localization ,multimodal ,transformer ,bilinear pooling ,Chemical technology ,TP1-1185 - Abstract
Visual geo-localization plays a crucial role in the positioning and navigation of unmanned aerial vehicles; its goal is to match the same geographic target across different views. This is a challenging task due to the drastic variations in viewpoint and appearance. Previous methods have focused on mining features inside the images but underestimated the influence of external factors and the interaction of different representations. Inspired by multimodal learning and bilinear pooling, we propose a feature fusion network (MBF) to address these inherent differences between drone and satellite views. We observe that the UAV's status, such as flight height, changes the size of the image's field of view, and that local parts of the target scene play an important role in extracting discriminative features. We therefore present two approaches to exploit these priors. First, status information is added to the network by transforming it into word embeddings, which are concatenated with image embeddings in the Transformer blocks to learn status-aware features. Second, global and local part feature maps from the same viewpoint are correlated and reinforced by hierarchical bilinear pooling (HBP) to improve the robustness of the feature representation. Through these approaches, we obtain more discriminative deep representations that facilitate geo-localization more effectively. Our experiments on existing benchmark datasets show a significant performance boost, reaching a new state-of-the-art result. Remarkably, recall@1 accuracy reaches 89.05% in the drone localization task and 93.15% in the drone navigation task on University-1652, and the model shows strong robustness at different flight heights on the SUES-200 dataset.
- Published
- 2023
- Full Text
- View/download PDF
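The hierarchical bilinear pooling (HBP) step above correlates global and local part features. Its core operation is a bilinear (outer-product) interaction, commonly followed by signed square-root and L2 normalisation; the sketch below shows that building block only, with illustrative names, and omits the learned projection layers a full HBP module would use.

```python
import math

def bilinear_pool(u, v):
    # outer product flattened: every pairwise interaction u_i * v_j
    return [ui * vj for ui in u for vj in v]

def signed_sqrt_l2(x):
    """Standard post-processing for bilinear features: signed square root
    followed by L2 normalisation, which tames large interaction values."""
    y = [math.copysign(math.sqrt(abs(v)), v) for v in x]
    n = math.sqrt(sum(v * v for v in y)) or 1.0
    return [v / n for v in y]
```

Pooling a d1-dimensional vector with a d2-dimensional one yields a d1*d2 interaction vector, which is why real implementations project to low dimensions first.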
43. Absolute Orientation and Localization Estimation from an Omnidirectional Image
- Author
-
Liu, Ruyu, Zhang, Jianhua, Yin, Kejie, Pan, Zhiyin, Lin, Ruihao, Chen, Shengyong, Hutchison, David, Series Editor, Kanade, Takeo, Series Editor, Kittler, Josef, Series Editor, Kleinberg, Jon M., Series Editor, Mattern, Friedemann, Series Editor, Mitchell, John C., Series Editor, Naor, Moni, Series Editor, Pandu Rangan, C., Series Editor, Steffen, Bernhard, Series Editor, Terzopoulos, Demetri, Series Editor, Tygar, Doug, Series Editor, Weikum, Gerhard, Series Editor, Geng, Xin, editor, and Kang, Byeong-Ho, editor
- Published
- 2018
- Full Text
- View/download PDF
44. Memory Segment Matching Network Based Image Geo-Localization
- Author
-
Jienan Chen, Yunzhi Duan, Gerald E. Sobelman, and Cong Zhang
- Subjects
Computer vision ,image matching ,artificial intelligence ,memory segment matching network ,geo-localization ,hidden Markov model (HMM) ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
Humans and other animals can easily perform self-localization by means of vision. However, this remains a challenging task for computer vision algorithms based on traditional image matching methods. In this paper, we propose a memory segment matching network for image geo-localization inspired by the discovery of place cells in the brain. A place cell becomes active when an animal enters a particular location where the external sensory information from the environment matches features stored in the hippocampus. To emulate the operation of place cells, we employ a convolutional neural network (CNN) and a long short-term memory (LSTM) network to extract visual features of the environment. The extracted features are stored as memory segments bound to location tags. A matching network calculates the cross-firing probability between each memory segment and the current visual input. The final location prediction is obtained by sending the cross-firing probabilities to an inference engine based on a hidden Markov model (HMM). According to the simulation results, localization accuracy reaches up to 95% on the datasets tested, which outperforms the state-of-the-art by 17% in localization detection accuracy.
- Published
- 2019
- Full Text
- View/download PDF
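The HMM inference engine described above can be illustrated with a standard Viterbi decoder over discrete locations, where the per-frame memory-segment match scores play the role of observation probabilities. This is a textbook sketch under assumed inputs, not the paper's implementation.

```python
def viterbi_location(match_probs, transition, prior):
    """Infer the most likely location sequence from per-frame memory-segment
    match probabilities, smoothing noisy single-frame matches with an HMM.

    match_probs[t][s]: match score of location s at frame t
    transition[p][s]:  probability of moving from location p to s
    prior[s]:          initial belief over locations
    """
    n = len(prior)
    # delta[s] = probability of the best path ending in state s
    delta = [prior[s] * match_probs[0][s] for s in range(n)]
    back = []
    for t in range(1, len(match_probs)):
        new, ptr = [], []
        for s in range(n):
            best_prev = max(range(n), key=lambda p: delta[p] * transition[p][s])
            ptr.append(best_prev)
            new.append(delta[best_prev] * transition[best_prev][s]
                       * match_probs[t][s])
        delta, back = new, back + [ptr]
    # backtrack from the best final state
    path = [max(range(n), key=lambda s: delta[s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]
```

With sticky transitions, a single noisy frame is smoothed away by the neighbouring frames.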
45. On the role of geometry in geo-localization.
- Author
-
Kadosh, Moti, Moses, Yael, and Shamir, Ariel
- Subjects
GEOMETRY ,GRAPHICAL projection ,URBAN renewal - Abstract
Consider the geo-localization task of finding the pose of a camera in a large 3D scene from a single image. Most existing CNN-based methods use textured images as input. We aim to explore experimentally whether texture and correlation between nearby images are necessary in a CNN-based solution to the geo-localization task. To do so, we consider lean images: textureless projections of a simple 3D model of a city. They contain only information related to the geometry of the viewed scene (edges, faces, and relative depth). The main contributions of this paper are: (i) to demonstrate the ability of CNNs to recover camera pose from lean images; and (ii) to provide insight into the role of geometry in the CNN learning process. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
46. Image-Based Geo-Localization Using Satellite Imagery.
- Author
-
Hu, Sixing and Lee, Gim Hee
- Subjects
- *
REMOTE-sensing images , *PATTERN recognition systems , *ARTIFICIAL neural networks , *IMAGE registration , *BEACONS , *STREAMING video & television , *TELECOMMUNICATION satellites , *DESCRIPTOR systems - Abstract
The problem of localization on a geo-referenced satellite map given a query ground-view image is useful yet remains challenging due to the drastic change in viewpoint. To this end, in this paper we extend our earlier work on the Cross-View Matching Network (CVM-Net) (Hu et al., in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018) for the ground-to-aerial image matching task, since traditional image descriptors fail under the drastic viewpoint change. In particular, we show more extensive experimental results and analyses of the network architecture of our CVM-Net. Furthermore, we propose a Markov localization framework that enforces temporal consistency between image frames to enhance the geo-localization results in the case where a video stream of ground-view images is available. Experimental results show that our proposed Markov localization framework can continuously localize the vehicle within a small error on our Singapore dataset. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
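The Markov localization framework above enforces temporal consistency across video frames. A single step of such a filter over discrete map locations, prediction with a transition model followed by a measurement update from per-location image-matching scores, can be sketched as follows. Names and inputs are illustrative, not the CVM-Net code.

```python
def markov_update(belief, transition, likelihood):
    """One step of a Markov localization filter: motion/transition
    prediction, then a measurement update from the image-matching
    score of each map location, then renormalisation."""
    n = len(belief)
    predicted = [sum(belief[p] * transition[p][s] for p in range(n))
                 for s in range(n)]
    posterior = [predicted[s] * likelihood[s] for s in range(n)]
    total = sum(posterior) or 1.0
    return [v / total for v in posterior]
```

Called once per video frame with the latest matching scores, the belief concentrates on locations consistent with both the motion model and the observations.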
47. Dashboard de Interacção Atleta-Treinador na Análise de Desempenhos [Athlete-Coach Interaction Dashboard for Performance Analysis].
- Author
-
Maria Guimarães, José and Pestana, Gabriel
- Abstract
Copyright of CISTI (Iberian Conference on Information Systems & Technologies / Conferência Ibérica de Sistemas e Tecnologias de Informação) Proceedings is the property of Conferencia Iberica de Sistemas Tecnologia de Informacao and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2019
48. PetalView: Fine-grained Location and Orientation Extraction of Street-view Images via Cross-view Local Search
- Author
-
Hu, Wenmiao, Zhang, Yichen, Liang, Yuxuan, Han, Xianjing, Yin, Yifang, Kruppa, Hannes, Ng, See-Kiong, Zimmermann, Roger, Hu, Wenmiao, Zhang, Yichen, Liang, Yuxuan, Han, Xianjing, Yin, Yifang, Kruppa, Hannes, Ng, See-Kiong, and Zimmermann, Roger
- Abstract
Satellite-based street-view information extraction by cross-view matching refers to a task that extracts the location and orientation information of a given street-view image query using one or multiple geo-referenced satellite images. Recent work has initiated a new research direction: finding accurate information within a local area covered by one satellite image centered at a location prior (e.g., from GPS). It can be used as a standalone solution or as a complementary step following a large-scale search with multiple satellite candidates. However, these existing works require an accurate initial orientation (angle) prior (e.g., from an IMU) and/or do not efficiently search through all possible poses. To allow efficient search and to give accurate predictions regardless of the existence or accuracy of an angle prior, we present PetalView extractors with multi-scale search. The PetalView extractors give semantically meaningful features that are equivalent across two drastically different views, and the multi-scale search strategy efficiently inspects the satellite image from coarse to fine granularity to provide sub-meter and sub-degree precision extraction. Moreover, when an angle prior is given, we propose a learnable prior angle mixer to utilize this information. Our method obtains the best performance on the VIGOR dataset and successfully improves performance on the KITTI dataset Test 1 set, raising the recall within 1 meter (r@1m) for location estimation to 68.88% and the recall within 1 degree (r@1d) to 21.10% when no angle prior is available; with an angle prior, it achieves stable estimations with r@1m and r@1d above 70% and 21%, respectively, up to a 40-degree noise level. © 2023 Owner/Author.
- Published
- 2023
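The multi-scale (coarse-to-fine) search strategy described above can be sketched as a grid search that repeatedly shrinks its window around the best-scoring pose, reaching fine precision without scoring every candidate. The scoring function, grid size and shrink rule here are illustrative assumptions, not the PetalView implementation.

```python
def coarse_to_fine_search(score, x_range, y_range, levels=3, grid=5):
    """Multi-scale pose search sketch: evaluate `score` on a coarse grid,
    then recursively refine around the best cell, achieving sub-cell
    precision at a fraction of an exhaustive fine search's cost."""
    (x0, x1), (y0, y1) = x_range, y_range
    best = None
    for _ in range(levels):
        sx = (x1 - x0) / (grid - 1)
        sy = (y1 - y0) / (grid - 1)
        cells = [(x0 + i * sx, y0 + j * sy)
                 for i in range(grid) for j in range(grid)]
        best = max(cells, key=lambda c: score(*c))
        # shrink the window to one coarse cell around the current best
        x0, x1 = best[0] - sx, best[0] + sx
        y0, y1 = best[1] - sy, best[1] + sy
    return best
```

Four levels of a 5 x 5 grid evaluate 100 poses instead of the several hundred a uniformly fine grid of the same resolution would need.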
49. Aerial geodetic total station platform for precise active positioning in GNSS-degraded environments.
- Author
-
Partsinevelos, Panagiotis, Petrakis, Georgios, Antonopoulos, Angelos, Fotakis, Tzanis, Bikos, Stathis, Charokopos, Zisis, and Tripolitsiotis, Achilleas
- Subjects
- *
GLOBAL Positioning System , *FIDUCIAL markers (Imaging systems) , *AERIAL surveys , *OPTICAL sensors , *MOBILE apps - Abstract
One of the main applications of UAVs (Uncrewed Aerial Vehicles) is precise mapping; however, most studies specialize either in rapid mapping without focusing on characteristic point localization, or in point positioning without real-time operation or high accuracy. In this study, an active localization method for GNSS (Global Navigation Satellite System)-degraded environments is proposed using a custom-built UAV as an aerial geodetic total station platform. The UAV, equipped with an RTK (Real-Time Kinematics)-GNSS receiver and an optimized gimbal that carries an optical sensor and a laser rangefinder, can autonomously detect, track and localize fiducial targets, using the UAV's orientation and position to export their coordinates in the WGS 84 geodetic reference system during the flight. The system has been validated in a significant number of experiments in various environments of increasing difficulty, yielding a three-dimensional error in the range of 4–15 cm. • Implementation of an aerial surveying framework. • Real-time precise localization method for GNSS-degraded environments. • Autonomous target detection and tracking. • Mobile application for the design and control of the localization method in the field. • System evaluation in various GNSS-degraded environments. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
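The core geometric step of such an aerial total station, projecting a laser range measurement from the UAV's RTK-GNSS position and gimbal angles to target coordinates, can be sketched under a flat-earth approximation. The angle conventions (yaw clockwise from north, pitch down from the horizontal) and the metres-per-degree constants are simplifying assumptions for illustration, not the authors' implementation.

```python
import math

def localize_target(lat, lon, alt, yaw_deg, pitch_deg, range_m):
    """Project a laser range measurement from the UAV pose to target
    coordinates (flat-earth approximation, adequate at short range)."""
    yaw = math.radians(yaw_deg)
    pitch = math.radians(pitch_deg)
    horiz = range_m * math.cos(pitch)   # ground-plane distance to target
    down = range_m * math.sin(pitch)    # height drop to target
    north = horiz * math.cos(yaw)
    east = horiz * math.sin(yaw)
    dlat = north / 111320.0             # approx. metres per degree latitude
    dlon = east / (111320.0 * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon, alt - down
```

A production system would also correct for gimbal mounting offsets and use a proper geodetic projection rather than constant metre-per-degree factors.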
50. Application of geographic population structure (GPS) algorithm for biogeographical analyses of populations with complex ancestries: a case study of South Asians from 1000 genomes project
- Author
-
Ranajit Das and Priyanka Upadhyai
- Subjects
Geographical population structure (GPS) ,Admixture ,Highly admixed populations ,Geo-localization ,South Asian population history ,Genetics ,QH426-470 - Abstract
Abstract Background The utilization of biological data to infer the geographic origins of human populations has been a long-standing quest for biologists and anthropologists. Several biogeographical analysis tools have been developed to infer the geographical origins of human populations from genetic data. However, due to the inherent complexity of genetic information, these approaches are prone to misinterpretation. The Geographic Population Structure (GPS) algorithm is an admixture-based tool for biogeographical analyses and has been employed for the geo-localization of various populations worldwide. Here we sought to dissect its sensitivity and accuracy for localizing highly admixed groups. Given the complex history of population dispersal and gene flow in the Indian subcontinent, we employed the GPS tool to localize five South Asian populations, Punjabi, Gujarati, Tamil, Telugu and Bengali, from the 1000 Genomes Project, some of whom were recent migrants to the USA and UK, using populations from the Indian subcontinent available in the Human Genome Diversity Panel (HGDP) and those previously described as reference. Results Our findings demonstrate reasonably high accuracy of GPS assignment even for recent migrant populations sampled elsewhere, namely the Tamil, Telugu and Gujarati individuals, of whom 96%, 87% and 79%, respectively, were positioned within 600 km of their native locations. In contrast, the absence of appropriate reference populations resulted in moderate-to-low precision in the positioning of Punjabi and Bengali genomes. Conclusions Our findings indicate that the GPS approach is useful but likely overly dependent on the relative proportions of admixture in the reference populations when determining the biogeographical origins of test individuals. We conclude that further modifications are desired to make this approach more suitable for highly admixed individuals.
- Published
- 2017
- Full Text
- View/download PDF
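The "% positioned within 600 km" accuracy figures reported above can be reproduced from predicted and true coordinates with a great-circle distance check, sketched below; the function names are illustrative.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = (math.sin(dp / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def fraction_within(predictions, truths, threshold_km=600.0):
    """Fraction of individuals whose predicted origin falls within the
    threshold distance of their true native location."""
    hits = sum(1 for (plat, plon), (tlat, tlon) in zip(predictions, truths)
               if haversine_km(plat, plon, tlat, tlon) <= threshold_km)
    return hits / len(predictions)
```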