Search Results (458 results)
2. Multi-class Object Detection in Urban Scenes Based on Deep Learning.
- Author
- Wang, Yunning, Liu, Xianglei, and Wang, Runjie
- Subjects
- OBJECT recognition (Computer vision), DEEP learning, URBAN planning, SMART cities, ENVIRONMENTAL monitoring
- Abstract
The rapid development of urbanization presents challenges and requirements for multi-class object detection in urban scenes. Accurately identifying buildings, vehicles, and trees in urban scenes can support urban planning, traffic management, and environmental monitoring, and promote the development of smart cities. Traditional target detection methods perform poorly in complex urban environments, while deep learning technology achieves accurate target recognition and positioning by automatically extracting high-level semantic features. In this study, we chose the YOLOv5s algorithm for multi-class target detection in urban scenes. YOLOv5s is a lightweight deep learning model with a small storage footprint and efficient detection speed. In this paper, the Potsdam-area data published by ISPRS are used to create label data for buildings, vehicles, and trees, and the YOLOv5s algorithm is trained iteratively. The results show that the YOLOv5s model reaches an mAP of 82.83%. The experimental results show that the algorithm achieves higher accuracy than SSD and Faster R-CNN in tree detection. Although its accuracy declines slightly for building and vehicle detection, considering factors such as detection accuracy, speed, and model size, the YOLOv5s algorithm provides better recognition and detection of multi-class targets in urban scenes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
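The mAP figure cited in this abstract is the mean of per-class average precision, computed from detections ranked by confidence. A minimal pure-Python sketch (illustrative only, not the authors' evaluation code; the detection outcomes below are hypothetical):

```python
def average_precision(is_tp, n_ground_truth):
    """AP = area under the precision-recall curve (all-point interpolation).

    is_tp: one boolean per detection, sorted by descending confidence,
           True if the detection matches an unclaimed ground-truth box.
    n_ground_truth: total number of ground-truth objects of this class.
    """
    tp = fp = 0
    precisions, recalls = [], []
    for hit in is_tp:
        tp += hit
        fp += not hit
        precisions.append(tp / (tp + fp))
        recalls.append(tp / n_ground_truth)
    # make the precision envelope non-increasing (right to left)
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += p * (r - prev_recall)  # rectangle under the envelope
        prev_recall = r
    return ap

# toy run: 4 ranked detections against 3 ground-truth objects
print(round(average_precision([True, False, True, True], 3), 3))  # 0.833
```

mAP is then the mean of this value over all classes (at a fixed IoU matching threshold).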
3. Advancing Coral Structural Connectivity Analysis through Deep Learning and Remote Sensing: A Case Study of South Pacific Tetiaroa Island.
- Author
- Zhang, Yunhan, Qin, Jiangying, Li, Ming, Han, Qiyao, Gruen, Armin, Li, Deren, and Zhong, Jiageng
- Subjects
- DEEP learning, CORAL reef conservation, REMOTE sensing, DISTANCE education, CORAL reef management, CORAL reefs & islands, CORALS, CORAL reef restoration
- Abstract
Structural connectivity is an important factor in preserving coral diversity. It maintains the stability and adaptability of coral reef ecosystems by facilitating ecological flow, species migration, and gene exchange between coral communities. However, there has long been a lack of consistent solutions for accurately describing and quantifying structural connectivity, which has hindered the understanding of the complex ecological processes in coral reefs. To address this, this paper proposes a framework that uses advanced remote sensing and deep learning technologies to assess coral structural connectivity. Specifically, accurate coral patches are first identified through image segmentation techniques, and structural connectivity is quantified by assessing the connectivity patterns between and within these coral patches. Furthermore, Tetiaroa Island in the South Pacific is used as a case study to validate the effectiveness and accuracy of the framework in assessing coral structural connectivity. The experimental results demonstrate that the proposed framework provides a powerful tool for understanding the internal ecological processes and external spatial patterns of coral reef ecosystems, thereby promoting scientific understanding and effective management of coral reef conservation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
4. WIFI LOG-BASED STUDENT BEHAVIOR ANALYSIS AND VISUALIZATION SYSTEM.
- Author
- Chen, F., Jing, C., Zhang, H., and Lv, X.
- Subjects
- PSYCHOLOGY of students, BEHAVIORAL assessment, BEHAVIORAL research, DEEP learning, DATA mining
- Abstract
Student behavior research can improve learning efficiency and provide decision evidence for infrastructure management. Existing campus-scale behavioral analysis works have not taken into account students' characteristics and spatiotemporal patterns. Moreover, the visualization methods are weak in wholeness, intuitiveness, and interactivity. In this paper, we design a geospatial dashboard-based student behavior analysis and visualization system that considers students' characteristics and spatiotemporal patterns. This system includes four components: user monitoring, data mining analysis, behavior prediction, and spatiotemporal visualization. Furthermore, a deep learning model based on LSTNet is used to predict student behaviour. Our work takes WiFi log data of a university in Beijing as the dataset. The results show that this system can identify student behavior patterns at a finer granularity through visualization, which is helpful in improving learning and living efficiency. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
5. Burned Area Detection with Sentinel-2A Data: Using Deep Learning Techniques with eXplainable Artificial Intelligence.
- Author
- Yilmaz, Elif Ozlem and Kavzoglu, Taskin
- Subjects
- NORMALIZED difference vegetation index, CONVOLUTIONAL neural networks, ARTIFICIAL intelligence, DEEP learning
- Abstract
Annually, a considerable quantity of forest is burned on a global scale. Therefore, it is essential to obtain precise and fast information regarding the size of burned regions in order to effectively monitor the adverse consequences of wildfires. The objective of this investigation is to demonstrate the effectiveness and usefulness of a deep learning (DL) architecture, namely Convolutional Neural Networks (CNNs), in the mapping of areas affected by fire, employing an eXplainable artificial intelligence (XAI) algorithm known as SHapley Additive exPlanations (SHAP) alongside accuracy evaluation criteria. Furthermore, this paper presents the evaluation of the Çanakkale-Kizilkeçili village wildfire. The research investigated the impacts of a variety of spectral indices, including the Normalized Burn Ratio (NBR), differenced Normalized Burn Ratio (dNBR), Green-Red Vegetation Index (GRVI), simple Ratio Vegetation Index (RVI), and Normalized Difference Vegetation Index (NDVI). At the end of the training process, the model achieved a training accuracy of approximately 0.99, with model loss values converging to approximately 0.1. The findings of the burned area identification analysis indicate that by incorporating spectral indices as supplementary information, the CNN model achieved a high level of accuracy, with an overall accuracy of 98.88% and a Kappa coefficient of 0.98. Additionally, the SHAP technique was employed to gain insights into the models' output. The feature importances of the spectral bands were determined through the SHAP analysis of the CNN model. The auxiliary data generated by the NBR, dNBR, and NDVI indices were identified as the most significant among the original bands and auxiliary data employed in this investigation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
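The spectral indices listed in this abstract are simple band arithmetic. A minimal sketch, assuming hypothetical Sentinel-2 surface reflectance values (the band pairings are the conventional ones, not taken from the paper):

```python
def nbr(nir, swir):
    """Normalized Burn Ratio: high for healthy vegetation, low for burned areas."""
    return (nir - swir) / (nir + swir)

def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red)

def dnbr(nir_pre, swir_pre, nir_post, swir_post):
    """Differenced NBR: pre-fire NBR minus post-fire NBR; larger = more severe burn."""
    return nbr(nir_pre, swir_pre) - nbr(nir_post, swir_post)

# hypothetical Sentinel-2 reflectances (B8 = NIR, B4 = red, B12 = SWIR)
print(round(ndvi(0.45, 0.08), 3))              # 0.698 -> dense vegetation
print(round(dnbr(0.45, 0.12, 0.18, 0.30), 3))  # positive dNBR suggests burning
```

In practice these are computed per pixel over the whole scene and can be stacked as auxiliary input channels, as the abstract describes.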
6. Innovative Research on Small Object Detection and Recognition in Remote Sensing Images Using YOLOv5.
- Author
- Jiang, Shan, Huang, He, Yang, Junxing, Zhang, Xin, and Wang, Siqi
- Subjects
- REMOTE-sensing images, IMAGE processing, DEEP learning, REMOTE sensing, URBAN planning
- Abstract
With the increase in remote sensing image acquisition methods and the volume of remote sensing image data, traditional manual annotation and recognition methods can no longer meet present-day production needs. This study explores the use of deep learning techniques to improve the efficiency and accuracy of target detection in remote sensing satellite images, especially for small targets. Traditional target detection methods often face challenges in recognition accuracy and processing speed due to the specificity and complexity of satellite images, such as large size, variable lighting conditions, and complex backgrounds. Therefore, this paper adopts the YOLOv5 model and introduces the CBAM (Convolutional Block Attention Module) attention mechanism, which significantly improves the detection of small and dense targets. Experimental validation of the improved YOLOv5 model on the VisDrone2021 dataset demonstrates that the model improves the mean average precision (mAP) by 1.9% while maintaining real-time performance. This paper provides new ideas for remote sensing image processing, especially for applications in fields such as urban planning and automatic driving. Despite the progress made in this study, the detection of small targets in remote sensing images, the limited classification accuracy, and the detection of dynamic targets still need further research. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
7. Deep Convolutional Network Based on Attention Mechanism for Matching Optical and SAR Images.
- Author
- He, Haiqing, Yu, Shixun, Zhou, Fuyang, Zhang, Hai, and Chen, Longyu
- Subjects
- OPTICAL images, SYNTHETIC apertures, SYNTHETIC aperture radar, DEEP learning, IMAGE recognition (Computer vision)
- Abstract
Complex geometric distortions and nonlinear radiation differences between optical and synthetic aperture radar (SAR) images present challenges for the matching of sufficient and evenly distributed corresponding points. To address this problem, this paper proposes a deep convolutional network based on an attention mechanism for matching optical and SAR images. In order to obtain robust feature points, we employ phase consistency instead of image intensity and gradient information for feature detection. A deep convolutional network (DCN) is designed to extract high-level semantic features between optical and SAR images, providing robustness to geometric distortion and nonlinear radiation changes. Notably, incorporating multiple inverted residual structures in the DCN facilitates efficient extraction of local and global features, promotes feature reuse, and reduces the loss of key features. Furthermore, a dense feature fusion module based on coordinate attention is designed, which focuses on the spatial positional information of effective features and integrates key features into deep descriptors to enhance their robustness to nonlinear radiometric differences. A coarse-to-fine strategy is then employed to enhance accuracy by eliminating mismatches. Experimental results demonstrate that the proposed network performs better than manually designed descriptor-based methods and state-of-the-art deep learning networks in both matching effectiveness and accuracy. Specifically, the number of matches achieved is approximately twice that of other methods, with a 10% improvement in F-measure. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
8. Deep Learning-based DSM Generation from Dual-Aspect SAR Data.
- Author
- Recla, Michael and Schmitt, Michael
- Subjects
- DEEP learning, ARTIFICIAL neural networks, SYNTHETIC aperture radar, DATA mining, REMOTE sensing, GEOMETRIC modeling
- Abstract
Rapid mapping demands efficient methods for a fast extraction of information from satellite data while minimizing data requirements. This paper explores the potential of deep learning for the generation of high-resolution urban elevation data from Synthetic Aperture Radar (SAR) imagery. In order to mitigate occlusion effects caused by the side-looking nature of SAR remote sensing, two SAR images from opposing aspects are leveraged and processed in an end-to-end deep neural network. The presented approach is the first of its kind to implicitly handle the transition from the SAR-specific slant range geometry to a ground-based mapping geometry within the model architecture. Comparative experiments demonstrate the superiority of the dual-aspect fusion over single-image methods in terms of reconstruction quality and geolocation accuracy. Notably, the model exhibits robust performance across diverse acquisition modes and geometries, showcasing its generalizability and suitability for height mapping applications. The study's findings underscore the potential of deep learning-driven SAR techniques in generating high-quality urban surface models efficiently and economically. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
9. AUTOMATIC SURFACE DAMAGE CLASSIFICATION DEVELOPED BASED ON DEEP LEARNING FOR WOODEN ARCHITECTURAL HERITAGE.
- Author
- Lee, J. and Yu, J. M.
- Subjects
- DEEP learning, MACHINE learning, HISTORIC sites, CULTURAL property, CLASSIFICATION
- Abstract
In this paper, we propose a system that automatically classifies the surface damage of wooden architectural cultural heritage based on deep learning algorithms. Commonly, on-site surface damage inspections of cultural heritage are carried out manually by field experts. However, it is difficult to manage cultural heritage because experts are not always on site to check for damage. To overcome this problem, a deep-learning-based classification method is designed to detect surface damage automatically so that cultural heritage monitoring can be done in real time. The dataset required for the development of the deep learning model comprised 4,000 images taken directly at cultural heritage sites. In a comparative analysis of the performance of four deep learning models on several examples of wooden architectural heritage, the damage detection rates of the models built in this study showed excellent performance, between 94.00% and 96.50%. When gradient-weighted class activation mapping was applied to visualize the damage detection results, the best-performing model stood out. The results of this paper are significant as a basic study for the development of a real-time remote damage detection system applicable to cultural heritage sites. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
10. DEEP LEARNING FOR SEMANTIC SEGMENTATION OF CORAL IMAGES IN UNDERWATER PHOTOGRAMMETRY.
- Author
- Zhang, H., Gruen, A., and Li, M.
- Subjects
- CORAL reefs & islands, IMAGE segmentation, CORALS, IMAGE processing, PHOTOGRAMMETRY, DEEP learning, TEST methods
- Abstract
Regular monitoring activities are important for assessing the influence of unfavourable factors on corals and tracking subsequent recovery or decline. Deep learning-based underwater photogrammetry provides a comprehensive solution for automatic large-scale and precise monitoring. It can quickly acquire a large range of underwater coral reef images, and extract information from these coral images through advanced image processing technology and deep learning methods. This procedure has three major components: (a) generation of 3D models, (b) understanding of relevant corals in the images, and (c) tracking of those models over time and spatial change analysis. This paper focuses on issue (b): it applies five state-of-the-art neural networks to the semantic segmentation of coral images, compares their performance, and proposes a new coral semantic segmentation method. In order to quantitatively evaluate the networks in these experiments, this paper uses mean class-wise Intersection over Union (mIoU), the most commonly used accuracy measure in semantic segmentation, as the standard metric. Meanwhile, considering that coral boundaries are very irregular and the IoU index alone is not accurate enough, a new segmentation evaluation index based on boundary quality, Boundary IoU, is also used to evaluate the segmentation results. The proposed trained network can accurately distinguish living from dead corals, which could reflect the health of the corals in the area of interest. The classification results show that we achieve state-of-the-art performance compared to other methods tested on the underwater coral image dataset provided in this paper. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
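The mIoU metric described in this abstract can be computed from a class confusion matrix accumulated over pixels. A minimal sketch on a toy label map (not the paper's evaluation code):

```python
def mean_iou(preds, labels, n_classes):
    """Mean class-wise Intersection over Union from flat per-pixel label lists."""
    # confusion[c][k] counts pixels with ground truth c predicted as class k
    confusion = [[0] * n_classes for _ in range(n_classes)]
    for p, t in zip(preds, labels):
        confusion[t][p] += 1
    ious = []
    for c in range(n_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                              # missed pixels
        fp = sum(confusion[r][c] for r in range(n_classes)) - tp  # wrongly claimed
        denom = tp + fp + fn
        if denom:  # skip classes absent from both prediction and label
            ious.append(tp / denom)
    return sum(ious) / len(ious)

labels = [0, 0, 1, 1, 1, 2]   # toy ground-truth pixels (e.g. live/dead/background)
preds  = [0, 1, 1, 1, 2, 2]
print(round(mean_iou(preds, labels, 3), 3))  # 0.5
```

Boundary IoU follows the same intersection-over-union formula but restricts both masks to a thin band around their contours before counting.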
11. GeoPDNN 1.0: a semi-supervised deep learning neural network using pseudo-labels for three-dimensional shallow strata modelling and uncertainty analysis in urban areas from borehole data.
- Author
- Guo, Jiateng, Xu, Xuechuang, Wang, Luyuan, Wang, Xulei, Wu, Lixin, Jessell, Mark, Ogarko, Vitaliy, Liu, Zhibin, and Zheng, Yufei
- Subjects
- SUPERVISED learning, DEEP learning, GEOLOGICAL modeling, CITIES & towns, RADIAL basis functions, SUPPORT vector machines, GEOLOGICAL surveys, MACHINE learning
- Abstract
Borehole data are essential for conducting precise urban geological surveys and large-scale geological investigations. Traditionally, explicit modelling and implicit modelling have been the primary methods for visualizing borehole data and constructing 3D geological models. However, explicit modelling requires substantial manual labour, while implicit modelling faces problems related to uncertainty analysis. Recently, machine learning approaches have emerged as effective solutions for addressing these issues in 3D geological modelling. Nevertheless, the use of machine learning methods for constructing 3D geological models is often limited by insufficient training data. In this paper, we propose the semi-supervised deep learning using pseudo-labels (SDLP) algorithm to overcome the issue of insufficient training data. Specifically, we construct the pseudo-labels in the training dataset using the triangular irregular network (TIN) method. A 3D geological model is constructed using borehole data obtained from a real building engineering project in Shenyang, Liaoning Province, NE China. Then, we compare the results of the 3D geological model constructed based on SDLP with those constructed by a support vector machine (SVM) method and an implicit Hermite radial basis function (HRBF) modelling method. Compared to the 3D geological models constructed using the HRBF algorithm and the SVM algorithm, the 3D geological model constructed based on the SDLP algorithm better conforms to the sedimentation patterns of the region. The findings demonstrate that our proposed method effectively resolves the issues of insufficient training data when using machine learning methods and the inability to perform uncertainty analysis when using the implicit method. In conclusion, the semi-supervised deep learning method with pseudo-labelling proposed in this paper provides a solution for 3D geological modelling in engineering project areas with borehole data. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
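The pseudo-labelling idea behind SDLP — letting confident predictions on unlabelled data join the training set — can be illustrated with a toy round of self-training. The 1-nearest-neighbour "model", the feature vectors, and the confidence radius below are all hypothetical stand-ins (the paper itself constructs pseudo-labels with TIN interpolation over borehole data):

```python
def pseudo_label_round(labeled, unlabeled, confidence_radius):
    """One round of pseudo-labelling with a 1-nearest-neighbour stand-in model.

    labeled: list of (feature_vector, label) pairs; unlabeled: feature vectors.
    An unlabeled point closer than confidence_radius to a labelled point adopts
    that point's label and joins the training set as a pseudo-labelled sample.
    """
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    still_unlabeled = []
    for x in unlabeled:
        nearest_feat, nearest_lab = min(labeled, key=lambda fl: dist(x, fl[0]))
        if dist(x, nearest_feat) <= confidence_radius:
            labeled.append((x, nearest_lab))  # accept the pseudo-label
        else:
            still_unlabeled.append(x)         # defer to a later round
    return labeled, still_unlabeled

# toy strata features: two labelled boreholes, three unlabelled ones
labeled = [([0.0, 0.0], "clay"), ([10.0, 10.0], "sand")]
labeled, rest = pseudo_label_round(labeled, [[1.0, 0.0], [9.0, 10.0], [5.0, 5.0]], 2.0)
print([lab for _, lab in labeled], rest)  # ['clay', 'sand', 'clay', 'sand'] [[5.0, 5.0]]
```

Note that newly pseudo-labelled points can immediately attract later unlabelled points in the same round, which mirrors the iterative nature of self-training.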
12. A Multi-scale features-based cloud detection method for Suomi-NPP VIIRS day and night imagery.
- Author
- Li, Jun, Hu, Chengjie, Sheng, Qinghong, Xu, Jiawei, Zhu, Chongrui, and Zhang, Weili
- Subjects
- REMOTE sensing, DEEP learning, FEATURE extraction, ECONOMIC recovery
- Abstract
Cloud detection is a necessary step before the application of remote sensing images. However, most methods focus on cloud detection in daytime remote sensing images. Nighttime remote sensing images, often ignored, play an increasingly important role in many fields such as urban monitoring, population estimation, and disaster assessment. The radiation intensity similarity between artificial lights and clouds is higher in nighttime remote sensing images than in daytime images, which makes it difficult to distinguish artificial lights from clouds. Therefore, this paper proposes a deep learning-based method (MFFCD-Net) to detect clouds in day and nighttime remote sensing images. MFFCD-Net is designed on an encoder-decoder structure. The encoder adopts ResNet-50 as the backbone network for better feature extraction, and a dilated residual up-sampling module (DR-UP) is designed in the decoder for up-sampling feature maps while enlarging the receptive field. A multi-scale feature extraction fusion module (MFEF) is designed to enhance the ability of MFFCD-Net to distinguish the regular textures of artificial lights from the random textures of clouds. A Global Feature Recovery Fusion module (GFRF) is designed to select and fuse features from the encoding and decoding stages, thus achieving better cloud detection accuracy. This is the first time a deep learning-based method has been designed for cloud detection in both day and nighttime remote sensing images. The experimental results on Suomi-NPP VIIRS DNB images show that MFFCD-Net achieves higher accuracy than baseline methods on both day and nighttime remote sensing images. Results on daytime images indicate that MFFCD-Net obtains a better balance between commission and omission rates than baseline methods (92.3% versus 90.5% on F1-score). Although artificial lights introduce strong interference in cloud detection in nighttime images, the accuracy values of MFFCD-Net on OA, Precision, Recall, and F1-score are still higher than 90%. This demonstrates that MFFCD-Net can better distinguish artificial lights from clouds than baseline methods in nighttime remote sensing images. The effectiveness of MFFCD-Net shows that it is very promising for cloud detection in both day and nighttime remote sensing images. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
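The commission/omission balance and F1-score mentioned in this abstract are related by precision = 1 - commission error and recall = 1 - omission error. A minimal sketch with hypothetical cloud-mask confusion counts (not the paper's numbers):

```python
def f1_from_errors(commission, omission):
    """F1 from commission error (1 - precision) and omission error (1 - recall)."""
    precision, recall = 1.0 - commission, 1.0 - omission
    return 2 * precision * recall / (precision + recall)

# hypothetical pixel-level confusion counts for a cloud mask
tp, fp, fn = 923, 60, 94
commission = fp / (tp + fp)   # fraction of predicted cloud that is not cloud
omission = fn / (tp + fn)     # fraction of true cloud that was missed
print(round(f1_from_errors(commission, omission), 3))  # 0.923
```

The harmonic mean penalizes an imbalance between the two error types, which is why F1 is a natural summary of the commission/omission trade-off.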
13. A submesoscale eddy identification dataset in the northwest Pacific Ocean derived from GOCI I chlorophyll a data based on deep learning.
- Author
- Wang, Yan, Chen, Ge, Yang, Jie, Gui, Zhipeng, and Peng, Dehua
- Subjects
- OBJECT recognition (Computer vision), DIGITAL image processing, IMAGE intensifiers, EDDIES, ENERGY dissipation, DEEP learning
- Abstract
This paper presents a dataset on the identification of submesoscale eddies, derived from high-resolution chlorophyll a data captured by GOCI I in the northwest Pacific Ocean. Our methodology involves a combination of digital image processing, filtering, and object detection techniques, along with a specific chlorophyll a image enhancement procedure to extract essential information about submesoscale eddies. This information includes their time, polarity, geographical coordinates of the eddy center, eddy radius, coordinates of the upper left and lower right corners of the prediction box, area of the eddy's inner ellipse, and confidence score. The dataset spans eight time intervals, ranging from 00:00 to 08:00 (UTC) daily, covering the period from 1 April 2011 to 31 March 2021. A total of 19 136 anticyclonic eddies and 93 897 cyclonic eddies were identified, with a minimum confidence threshold of 0.2. The mean radius of anticyclonic eddies is 24.44 km (range 2.5 to 44.25 km), while that of cyclonic eddies is 12.34 km (range 1.75 to 44 km). This unprecedented hourly resolution dataset on submesoscale eddies offers valuable insights into their distribution, morphology, and energy dissipation. It significantly contributes to our understanding of marine environments, ecosystems, and the improvement of climate model predictions. The dataset is available at 10.5281/zenodo.13989785 (Wang and Yang, 2023). [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
14. Automatic detection of instream large wood in videos using deep learning.
- Author
- Aarnink, Janbert, Beucler, Tom, Vuaridel, Marceline, and Ruiz-Villanueva, Virginia
- Subjects
- DEEP learning, CONVOLUTIONAL neural networks, DATA augmentation
- Abstract
Instream large wood (i.e., downed trees, branches, and roots larger than 1 m in length and 10 cm in diameter) has essential geomorphological and ecological functions supporting the health of river ecosystems. Still, even though its transport during floods may pose a risk, it is rarely observed and, therefore, poorly understood. This paper presents a novel approach to detecting pieces of instream wood from video. The approach uses a Convolutional Neural Network to detect wood automatically. We sampled data to represent different wood transport conditions, combining 20 datasets to yield thousands of instream wood images. We designed multiple scenarios using different data subsets with and without data augmentation and analyzed the contribution of each to the effectiveness of the model using k-fold cross-validation. The mean average precision of the model varies between 35 and 93 percent and is highly influenced by the quality of the data on which it is evaluated. When the image resolution is low, the identified components in the labeled pieces, rather than exhibiting distinct characteristics such as bark or branches, appear more akin to amorphous masses or 'blobs'. We found that the model detects wood with a mean average precision of 67 percent at an input image resolution of 418 pixels. Improvements of up to 23 percent could be achieved in some instances, and increasing the input resolution raised the weighted mean average precision to 74 percent. We show that the detection performance on a specific dataset is not solely determined by the complexity of the network or the training data. Therefore, the findings of this paper can be used when designing a custom wood detection network. With the growing availability of flood-related videos featuring wood uploaded to the internet, this methodology facilitates the quantification of wood transport across a wide variety of data sources. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
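The k-fold cross-validation used in this abstract partitions the samples into k disjoint validation folds, each model being trained on the remainder. A minimal index-splitting sketch (illustrative only; a real experiment would split at the dataset or video level to avoid leakage between near-identical frames):

```python
import random

def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)  # deterministic shuffle for reproducibility
    # distribute the remainder so fold sizes differ by at most one
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = idx[start:start + size]
        train = idx[:start] + idx[start + size:]
        yield train, val
        start += size

folds = list(k_fold_splits(10, 3))
print([len(v) for _, v in folds])  # [4, 3, 3]
```

Every sample lands in exactly one validation fold, so averaging the per-fold metric uses each sample for evaluation exactly once.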
15. Comparison Study of Three Building Regularization Algorithms.
- Author
- Bulatov, Dimitri, Mousa, Yousif A., and Helmholz, Petra
- Subjects
- URBAN planning, DEEP learning, PLUG-ins (Computer programs), REMOTE sensing, URBAN renewal
- Abstract
Building outline generation and regularization is an ongoing topic in remote sensing applications. The success of the methods used for building outline detection impacts studies that depend on their accuracy, such as urban planning, geospatial analysis, and 3D city modeling. The results of building outline detection methods can vary due to several factors, such as the area to which they are applied and/or the parameters used. Since there are well-established deep-learning-based software and plugins for building outlining, this paper compares the results of such a method, namely an open-source AI implementation (Mapflow), with the standard non-deep-learning-based tools introduced by (Mousa et al., 2019) and (Bulatov et al., 2014). We present a comparative analysis of these methods for regularizing building outlines in terms of accuracy, efficiency, and robustness in dealing with different levels of building complexity and structure. While the results of (Mousa et al., 2019) and (Bulatov et al., 2014) are comparable and outperform the AI method, (Mousa et al., 2019) is the method least impacted by the different parameters, though it has a higher computing time. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
16. Analyzing the impact of semantic LoD3 building models on image-based vehicle localization.
- Author
- Bieringer, Antonia, Wysocki, Olaf, Tuttas, Sebastian, Hoegner, Ludwig, and Holst, Christoph
- Subjects
- GLOBAL Positioning System, VEHICLE models, DEEP learning
- Abstract
Numerous navigation applications rely on data from global navigation satellite systems (GNSS), even though their accuracy is compromised in urban areas, posing a significant challenge, particularly for precise autonomous car localization. Extensive research has focused on enhancing localization accuracy by integrating various sensor types to address this issue. This paper introduces a novel approach for car localization, leveraging image features that correspond with highly detailed semantic 3D building models. The core concept involves augmenting positioning accuracy by incorporating prior geometric and semantic knowledge into the calculations. The work assesses outcomes using Level of Detail 2 (LoD2) and Level of Detail 3 (LoD3) models, analyzing whether facade-enriched models yield superior accuracy. This comprehensive analysis encompasses diverse methods, including off-the-shelf feature matching and deep learning, facilitating thorough discussion. Our experiments corroborate that LoD3 enables detecting up to 69% more features than LoD2 models. We believe that this study will contribute to research on enhancing positioning accuracy in GNSS-denied urban canyons. It also shows a practical application of under-explored LoD3 building models in map-based car positioning. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
17. Pole-NN: Few-Shot Classification of Pole-Like Objects in Lidar Point Clouds.
- Author
- Zhang, Zezheng, Khoshelham, Kourosh, and Shojaei, Davood
- Subjects
- POINT cloud, TRAFFIC signs & signals, LIDAR, DEEP learning, CLASSIFICATION, TRAINING needs, GRIDS (Cartography), DAYLIGHT
- Abstract
In the realm of autonomous systems and smart-city initiatives, accurately detecting and localizing pole-like objects (PLOs) such as electrical poles and traffic signs has become crucial. Despite their significance, the diverse nature of PLOs complicates their accurate recognition. Point cloud data and 3D deep learning models offer a promising approach to PLO localization under varied lighting, addressing issues faced by camera systems. However, the distinct characteristics of different street scenes worldwide require infeasibly extensive training data for satisfactory results because of the nature of deep learning. This prohibitively increases the cost of lidar data capture and annotation. This paper introduces a novel few-shot learning framework for the classification of outdoor point cloud objects, leveraging a minimalistic approach that requires only a single support sample for effective classification. Central to our methodology is the development of Pole-NN, a Non-parametric Network that efficiently distinguishes between various PLOs and other road assets without the need for extensive training datasets traditionally associated with deep learning models. Additionally, we present the Parkville-3D Dataset, an annotated point cloud dataset we have captured and labelled, which addresses the notable scarcity of fine-grained PLO datasets. Our experimental results demonstrate the potential of our approach to utilize the intrinsic spatial relationships within point cloud data, promoting a more efficient and resource-conscious strategy for PLO classification. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
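The single-support-sample classification described in this abstract can be reduced to nearest-neighbour matching in an embedding space. A toy sketch (the class names and 3-D "embeddings" are hypothetical stand-ins; Pole-NN itself operates on point cloud features):

```python
def classify_one_shot(query, support):
    """Assign the label of the nearest support embedding (one sample per class).

    query: feature vector of the object to classify
    support: dict mapping class name -> its single support feature vector
    """
    def dist(a, b):
        # Euclidean distance between two embeddings
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return min(support, key=lambda label: dist(query, support[label]))

# hypothetical 3-D embeddings of pole-like object classes
support = {
    "lamp_post":    [0.9, 0.1, 0.2],
    "traffic_sign": [0.2, 0.8, 0.1],
    "utility_pole": [0.1, 0.2, 0.9],
}
print(classify_one_shot([0.85, 0.15, 0.25], support))  # lamp_post
```

Because there are no trainable parameters beyond the embedding itself, adding a new PLO class only requires one labelled support sample, which is the appeal of the non-parametric design.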
18. Deep Learning Based Semantic Segmentation for BIM Model Generation from RGB-D Sensors.
- Author
- Rached, Ishraq, Hajji, Rafika, Landes, Tania, and Haffadi, Rashid
- Subjects
- COMPUTER vision, DETECTORS, DEEP learning, SCAN statistic, POINT cloud, ACQUISITION of data
- Abstract
RGB-D sensors offer a low-cost and promising solution to streamline the generation of BIM models. This paper introduces a framework designed to automate the creation of detailed and semantically rich BIM models from RGB-D data in indoor environments. The framework leverages advanced computer vision and deep learning techniques to overcome the challenges associated with traditional, labour-intensive BIM modeling methods. The results show that the proposed method is robust and accurate compared to high-quality static terrestrial laser scanning (TLS). Indeed, 58% of the distances measured between the calculated point cloud and the reference point cloud produced by TLS were under 5 cm, and 82% of distances were smaller than 7 cm. Furthermore, the framework achieves 100% accuracy in element extraction. Beyond its accuracy, the proposed framework significantly enhances efficiency in both data acquisition and processing. In contrast to the time-consuming process associated with TLS, our approach remarkably reduces data collection and processing time by a factor of eight. This highlights the framework's substantial improvements in accuracy and efficiency throughout the BIM generation workflow, making it a streamlined and time-effective solution. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
19. Building-PCC: Building Point Cloud Completion Benchmarks.
- Author
-
Gao, Weixiao, Peters, Ravi, and Stoter, Jantien
- Subjects
POINT cloud ,DEEP learning ,SOURCE code ,CLOUD computing ,EVALUATION methodology - Abstract
With the rapid advancement of 3D sensing technologies, obtaining 3D shape information of objects has become increasingly convenient. Lidar technology, with its capability to accurately capture the 3D information of objects at long distances, has been widely applied in the collection of 3D data in urban scenes. However, the collected point cloud data often exhibit incompleteness due to factors such as occlusion, signal absorption, and specular reflection. This paper explores the application of point cloud completion technologies in processing these incomplete data and establishes a new real-world benchmark Building-PCC dataset, to evaluate the performance of existing deep learning methods in the task of urban building point cloud completion. Through a comprehensive evaluation of different methods, we analyze the key challenges faced in building point cloud completion, aiming to promote innovation in the field of 3D geoinformation applications. Our source code is available at
https://github.com/tudelft3d/Building-PCC-Building-Point-Cloud-Completion-Benchmarks.git
. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
20. Unit-level LoD2 Building Reconstruction from Satellite-derived Digital Surface Model and Orthophoto.
- Author
-
Gui, Shengxi, Schuegraf, Philipp, Bittner, Ksenia, and Qin, Rongjun
- Subjects
DIGITAL elevation models ,HIGH resolution imaging ,MICROSOFT Surface ,REMOTE-sensing images ,DEEP learning - Abstract
Recent advancements in deep learning have enabled the possibility to identify unit-level building sections from very high resolution satellite images. By learning from examples, deep models can capture patterns in low-resolution roof textures to separate building units within duplex buildings. This paper demonstrates that such unit-level segmentation can further advance level-of-detail 2 (LoD2) modeling. We extend a building boundary regularization method by adapting noisy unit-level segmentation results. Specifically, we propose a novel polygon composition approach to ensure that the individually segmented units within a duplex building or dense adjacent buildings are consistent in their shared boundaries. Results of the experiments show that our unit-level LoD2 modeling favorably outperforms state-of-the-art LoD2 modeling results from satellite images. [ABSTRACT FROM AUTHOR]
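The core of boundary consistency between adjacent units can be illustrated with a toy vertex-snapping sketch. This is not the paper's polygon composition method, just the underlying idea of forcing near-coincident vertices on a shared wall to agree:

```python
def snap_shared_vertices(poly_a, poly_b, tol=0.1):
    """Snap vertices of poly_b onto nearby vertices of poly_a so that two
    adjacent building units share an identical boundary. Toy version of
    boundary-consistency enforcement; real regularization is more involved.
    Polygons are lists of (x, y) tuples; tol is in map units."""
    snapped = []
    for bx, by in poly_b:
        best = min(poly_a, key=lambda v: (v[0] - bx) ** 2 + (v[1] - by) ** 2)
        if (best[0] - bx) ** 2 + (best[1] - by) ** 2 <= tol ** 2:
            snapped.append(best)       # adopt the neighbour's vertex
        else:
            snapped.append((bx, by))   # keep the original vertex
    return snapped

# Two units whose shared wall disagrees by a few centimetres.
unit_a = [(0, 0), (5, 0), (5, 10), (0, 10)]
unit_b = [(5.04, 0.02), (10, 0), (10, 10), (4.97, 10.01)]
print(snap_shared_vertices(unit_a, unit_b))
```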
- Published
- 2024
- Full Text
- View/download PDF
21. Image-based Deep Learning for the time-dependent prediction of fresh concrete properties.
- Author
-
Meyer, Max, Langer, Amadeus, Mehltretter, Max, Beyer, Dries, Coenen, Max, Schack, Tobias, Haist, Michael, and Heipke, Christian
- Subjects
CONVOLUTIONAL neural networks ,DEEP learning ,SELF-consolidating concrete ,CONCRETE ,CONCRETE industry ,YIELD stress ,CONCRETE mixing - Abstract
Increasing the degree of digitisation and automation in the concrete production process can play a crucial role in reducing the CO2 emissions that are associated with the production of concrete. In this paper, a method is presented that makes it possible to predict the properties of fresh concrete during the mixing process based on stereoscopic image sequences of the concrete's flow behaviour. A Convolutional Neural Network (CNN) is used for the prediction, which receives the images supported by information on the mix design as input. In addition, the network receives temporal information in the form of the time difference between the time at which the images are taken and the time at which the reference measurements of the concrete are carried out. With this temporal information, the network implicitly learns the time-dependent behaviour of the concrete's properties. The network predicts the slump flow diameter, the yield stress and the plastic viscosity. The time-dependent prediction potentially opens up the pathway to determining the temporal development of the fresh concrete properties already during mixing. This provides a huge advantage for the concrete industry, as countermeasures can be taken in a timely manner. It is shown that an approach based on depth and optical flow images, supported by information on the mix design, achieves the best results. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
22. DEEP-IMAGE-MATCHING: A TOOLBOX FOR MULTIVIEW IMAGE MATCHING OF COMPLEX SCENARIOS.
- Author
-
Morelli, L., Ioli, F., Maiwald, F., Mazzacca, G., Menna, F., and Remondino, F.
- Subjects
IMAGE registration ,DEEP learning ,COMPUTER vision ,HIGH resolution imaging ,INTEGRATED software - Abstract
Finding corresponding points between images is a fundamental step in photogrammetry and computer vision tasks. Traditionally, image matching has relied on hand-crafted algorithms such as SIFT or ORB. However, these algorithms face challenges when dealing with multi-temporal images, varying radiometry and contents, as well as significant viewpoint differences. Recently, the computer vision community has proposed several deep learning-based approaches that are trained for challenging illumination and wide viewing angle scenarios. However, they suffer from certain limitations, such as sensitivity to rotations, and they are not applicable to high-resolution images due to computational constraints. In addition, they are not widely used by the photogrammetric community due to limited integration with standard photogrammetric software packages. To overcome these challenges, this paper introduces Deep-Image-Matching, an open-source toolbox designed to match images using different matching strategies, ranging from traditional hand-crafted to deep-learning methods (https://github.com/3DOM-FBK/deep-image-matching). The toolbox accommodates high-resolution datasets, e.g. data acquired with full-frame or aerial sensors, and addresses known rotation-related problems of the learned features. The toolbox produces image correspondences that are directly compatible with commercial and open-source software packages, such as COLMAP and openMVG, for bundle adjustment. The paper also includes a series of cultural heritage case studies that present challenging conditions where traditional hand-crafted approaches typically fail. [ABSTRACT FROM AUTHOR]
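The classic hand-crafted matching strategy that such toolboxes also support can be sketched in a few lines: nearest-neighbour descriptor matching with Lowe's ratio test. The descriptors below are toy numpy vectors, not real SIFT features:

```python
import numpy as np

def ratio_test_match(desc1, desc2, ratio=0.8):
    """Nearest-neighbour descriptor matching with Lowe's ratio test: keep a
    match only if the best distance is clearly smaller than the second best.
    Toy version; real pipelines use SIFT/ORB or learned descriptors."""
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)
        order = np.argsort(dists)
        best, second = order[0], order[1]
        if dists[best] < ratio * dists[second]:
            matches.append((i, int(best)))
    return matches

# Toy descriptors: rows 0 and 2 of d1 have clear counterparts in d2,
# row 1 is ambiguous (two equally good candidates) and is rejected.
d1 = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])
d2 = np.array([[0.0, 1.0], [1.0, 0.0], [0.45, 0.5], [0.55, 0.5]])
print(ratio_test_match(d1, d2))  # [(0, 1), (2, 0)]
```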
- Published
- 2024
- Full Text
- View/download PDF
23. Advancing Arctic sea ice remote sensing with AI and deep learning: now and future.
- Author
-
Li, Wenwen, Hsu, Chia-Yu, and Tedesco, Marco
- Subjects
DEEP learning ,SEA ice ,REMOTE sensing ,ARTIFICIAL intelligence ,BIG data - Abstract
The revolutionary advances of Artificial Intelligence (AI) in the past decade have brought transformative innovation across science and engineering disciplines. Also in the field of Arctic science, we have witnessed an increasing trend in the adoption of AI, especially deep learning, to support the analysis of Arctic big data and facilitate new discoveries. In this paper, we provide a comprehensive review of the applications of deep learning in sea ice remote sensing domains, focusing on problems such as sea ice lead detection, thickness estimation, concentration, sea ice extent forecasting and motion detection as well as sea ice type classification. In addition to discussing these applications, we also summarize technological advances that provide customized deep learning solutions, including new loss functions and learning strategies to better understand sea ice dynamics. To promote the growth of this exciting interdisciplinary field, we further explore several research areas where the Arctic sea ice community can benefit from cutting-edge AI technology. These areas include improving multi-modal deep learning capabilities, enhancing model accuracy in measuring prediction uncertainty, better leveraging AI foundation models, and deepening the integration with physics-based models. We hope that this paper can serve as a cornerstone in the progress of Arctic sea ice research using AI and inspire further advances in this field. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
24. Study on the effect of color space in deep multitask learning neural networks for road segmentation.
- Author
-
Raninen, Jere, Zhu, Lingli, and Hattula, Emilia
- Subjects
COLOR space ,DEEP learning ,CONVOLUTIONAL neural networks ,DATA mining ,REMOTE sensing ,AUTOMOBILE license plates - Abstract
Precise road segmentation is an essential part of many applications related to road information extraction from remote sensing data. The effect of color space on road detection has rarely been studied. In this paper, the effects of different color spaces of aerial images and of multitask learning methods were evaluated for road segmentation using three deep convolutional neural networks: UNet, DenseU-Net, and RoadVecNet. The color spaces included RGB, HSV, LAB, YCbCr, and YUV. The multitask learning methods adopted in this study involved utilizing multiple inputs and multiple outputs. Multiple inputs were aerial images from the same area in different color spaces, and multiple outputs were road segmentation and road outline segmentation. As remote sensing data, the National Land Survey of Finland's true orthophotos (from 2020), the Massachusetts road imagery dataset, and the Ottawa dataset were applied. Segmentation masks for the National Land Survey of Finland's true orthophotos were extracted from Digiroad vectors with road width information. Road outline masks were generated from the segmentation masks. The studied neural networks were trained with the same data, learning rate, loss function, and optimizer for each color space and pair of color spaces. Multiple outputs were experimented with in the RGB color space. The comparative analysis assessed the performance of the neural networks across different color spaces using the F1-score metric. The experimental findings indicate that the choice of color space has little influence on the results of the neural networks; deep learning methods can adapt well to different color spaces. In addition, the use of sharpening and edge enhancement augmentations had a slight effect on the results. [ABSTRACT FROM AUTHOR]
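Preparing inputs in an alternative color space is a per-pixel conversion. A stdlib sketch of RGB to HSV (real pipelines use vectorized converters such as OpenCV's cv2.cvtColor, but the per-pixel mapping is the same):

```python
import colorsys

def rgb_image_to_hsv(image):
    """Convert a nested-list RGB image (channel values in [0, 1]) to HSV,
    pixel by pixel, using the standard-library colorsys conversion."""
    return [[colorsys.rgb_to_hsv(r, g, b) for (r, g, b) in row]
            for row in image]

# A 1x2 toy image: pure red and mid grey.
img = [[(1.0, 0.0, 0.0), (0.5, 0.5, 0.5)]]
print(rgb_image_to_hsv(img))
# Pure red -> hue 0 with full saturation; grey -> zero saturation.
```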
- Published
- 2024
- Full Text
- View/download PDF
25. DEVELOPMENT OF A DATABASE FOR BENCHMARK DATASETS IN PHOTOGRAMMETRY AND REMOTE SENSING.
- Author
-
Budde, L. E., Schmidt, J., Javanmard-Ghareshiran, A., Hunger, S., and Iwaszczuk, D.
- Subjects
REMOTE sensing ,DATABASE design ,DEEP learning ,NONRELATIONAL databases ,PHOTOGRAMMETRY ,DATABASES - Abstract
Data are a key component for many applications and methods in the domain of photogrammetry and remote sensing. Especially data-driven approaches such as deep learning rely heavily on available annotated data. The amount of data is increasing significantly every day. However, reference data is not increasing at the same rate and finding relevant data for a specific domain is still difficult. Thus, it is necessary to make existing reference data more accessible to the scientific community as far as possible in order to make optimal use of it. In this paper we provide an overview of the development of our photogrammetry and remote sensing specific Benchmark Metadata Database (BeMeDa). BeMeDa is based on MongoDB, a NoSQL database system. In addition, the development of a user-oriented metadata schema serves for data structuring. BeMeDa enables easy searching of benchmark datasets in the field of photogrammetry and remote sensing. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
26. TRAINING OF NEURAL NETWORKS TO DECIPHER THE ROAD NETWORK ACCORDING TO SPACE IMAGERY RECEIVED BY THE "RESURS-P".
- Author
-
Kasatikov, N. N., Umarov, S. M., Fadeeva, A. D., and Tolmachev, S. A.
- Subjects
DEEP learning ,DIGITAL twin ,TRAFFIC signs & signals ,TRAFFIC flow - Abstract
Our team has developed a neural network for road recognition on our digital twin, aimed at enhancing transportation-related applications. The neural network is trained on large datasets of road images and utilizes various deep learning architectures and techniques to improve its accuracy and reliability. The embedded neural network can recognize different road features, such as lane markings, road signs, and obstacles, and can identify the location and direction of the road. The integration of this neural network in our digital twin can help optimize transportation-related operations, reduce accidents, and improve overall traffic flow. The developed neural network architecture and training methodology, as well as its performance evaluation on various datasets, are presented in this paper. Additionally, the paper discusses the future directions for research in this area and the potential of the developed neural network for other applications in the digital twin domain. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
27. AUTOMATIC ROAD CRACK RECOGNITION BASED ON DEEP LEARNING NETWORKS FROM UAV IMAGERY.
- Author
-
Samadzadegan, F., Dadrass Javan, F., Hasanlou, M., Gholamshahi, M., and Ashtari Mahini, F.
- Subjects
DEEP learning ,OBJECT recognition (Computer vision) ,RECOGNITION (Psychology) ,DRONE aircraft ,PAVEMENTS ,INFRASTRUCTURE (Economics) - Abstract
Roads are one of the essential transportation infrastructures that get damaged over time and affect economic development and social activities. Therefore, accurate and rapid recognition of road damage such as cracks is necessary to prevent further damage and repair it in time. The traditional methods for recognizing cracks use survey vehicles equipped with various sensors, visual inspection of the road surface, and recognition algorithms in image processing. However, performing recognition operations using these methods is associated with high costs and low accuracy and speed. In recent years, the use of deep learning networks in object recognition and visual applications has increased, and these networks have become a suitable alternative to traditional methods. In this paper, the YOLOv4 deep learning network is used to recognize four types of cracks (transverse, longitudinal, alligator, and oblique) utilizing a set of 2000 RGB visible images. The proposed network with multiple convolutional layers extracts accurate semantic feature maps from input images and classifies road cracks into four classes. This network performs the recognition process with an error of 1% in the training phase and 77% F1-Score, 80% precision, 80% mean average precision (mAP), 77% recall, and 81% intersection over union (IoU) in the testing phase. These results demonstrate the acceptable accuracy and appropriate performance of the model in road crack recognition. [ABSTRACT FROM AUTHOR]
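The IoU figure reported above is the standard overlap measure between predicted and ground-truth boxes. A minimal sketch for axis-aligned boxes:

```python
def box_iou(a, b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2),
    the overlap measure used to score detections against ground truth."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# A predicted crack box shifted halfway off its ground-truth box:
# intersection 50, union 150 -> IoU = 1/3.
print(box_iou((0, 0, 10, 10), (5, 0, 15, 10)))
```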
- Published
- 2022
- Full Text
- View/download PDF
28. A daily reconstructed chlorophyll-a dataset in the South China Sea from MODIS using OI-SwinUnet.
- Author
-
Ye, Haibin, Yang, Chaoyu, Dong, Yuan, Tang, Shilin, and Chen, Chuqun
- Subjects
REMOTE sensing ,MESOSCALE eddies ,ORTHOGONAL functions ,DEEP learning ,MISSING data (Statistics) - Abstract
Satellite remote sensing of sea surface chlorophyll products sometimes yields a significant amount of sporadic missing data due to various factors, such as weather conditions and operational failures of satellite sensors. The limited nature of satellite observation data impedes the utilization of satellite data in the domain of marine research. Hence, it is highly important to investigate techniques for reconstructing satellite remote sensing data to obtain spatially and temporally uninterrupted and comprehensive data within the desired area. This approach will expand the potential applications of remote sensing data and enhance the efficiency of data usage. To address this series of problems, based on the demand for research on the ecological effects of multiscale dynamic processes in the South China Sea, this paper combines the advantages of the optimal interpolation (OI) method and SwinUnet and successfully develops a deep-learning model based on the expected variance in data anomalies, called OI-SwinUnet. The OI-SwinUnet method was used to reconstruct the MODIS chlorophyll-a concentration products of the South China Sea from 2013 to 2017. When comparing the performances of the data-interpolating empirical orthogonal function (DINEOF), OI, and Unet approaches, it is evident that the OI-SwinUnet algorithm outperforms the other algorithms in terms of reconstruction. We conduct a reconstruction experiment using different artificial missing patterns to assess the resilience of OI-SwinUnet. Ultimately, the reconstructed dataset was utilized to examine the seasonal variations and geographical distribution of chlorophyll-a concentrations in various regions of the South China Sea. Additionally, the impact of the plume front on the dispersion of phytoplankton in upwelling areas was assessed. The potential use of reconstructed products to investigate the process by which individual mesoscale eddies affect sea surface chlorophyll is also examined.
The reconstructed daily chlorophyll-a dataset is freely accessible at 10.5281/zenodo.10478524 (Ye et al., 2024). [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
29. Using deep learning to integrate paleoclimate and global biogeochemistry over the Phanerozoic Eon.
- Author
-
Zheng, Dongyu, Merdith, Andrew S., Goddéris, Yves, Donnadieu, Yannick, Gurung, Khushboo, and Mills, Benjamin J. W.
- Subjects
PHANEROZOIC Eon ,DEEP learning ,PALEOCLIMATOLOGY ,DATA structures ,BIOGEOCHEMISTRY ,CARBON cycle - Abstract
Databases of 3D paleoclimate model simulations are increasingly used within global biogeochemical models for the Phanerozoic Eon. This improves the accuracy of the surface processes within the biogeochemical models, but the approach is limited by the availability of large numbers of paleoclimate simulations at different pCO2 levels and for different continental configurations. In this paper we apply the Frame Interpolation for Large Motion (FILM) deep learning method to a set of Phanerozoic paleoclimate model simulations to upscale their time resolution from one model run every ∼25 million years to one model run every 1 million years (Myr). Testing the method on a 5 Myr time-resolution set of continental configurations and paleoclimates confirms the accuracy of our approach when reconstructing intermediate frames from configurations separated by up to 40 Myr. We then apply the method to upscale the paleoclimate data structure in the SCION climate-biogeochemical model. The interpolated surface temperature and runoff are reasonable and present a logical progression between the original key frames. When updated to use the high-time-resolution climate data structure, the SCION model predicts climate shifts that were not present in the original model outputs due to its previous use of widely spaced datasets and simple linear interpolation. We conclude that a time resolution of ∼10 Myr in Phanerozoic paleoclimate simulations is likely sufficient for investigating the long-term carbon cycle and that deep learning methods may be critical in attaining this time resolution at reasonable computational expense, as well as for developing new fully continuous methods in which 3D continental processes are able to translate over a moving continental surface in deep time. However, the efficacy of deep learning methods in interpolating runoff data, compared to that of paleogeography and temperature, is diminished by the heterogeneous distribution of runoff.
Consequently, interpolated climates must be confirmed by running a paleoclimate model if scientific conclusions are to be based directly on them. [ABSTRACT FROM AUTHOR]
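The simple linear interpolation between widely spaced key frames that SCION previously relied on, and which learned frame interpolation aims to improve upon, can be sketched directly (grid values and times below are illustrative):

```python
import numpy as np

def interpolate_field(frame_a, frame_b, t_a, t_b, t):
    """Linear interpolation of a gridded climate field between two key
    frames at times t_a and t_b (Myr): the simple baseline that learned
    frame interpolation replaces."""
    w = (t - t_a) / (t_b - t_a)
    return (1.0 - w) * frame_a + w * frame_b

# Surface temperature on a tiny 2x2 grid at 400 Ma and 375 Ma,
# interpolated to 390 Ma (weight 0.4 toward the younger frame).
temp_400 = np.array([[10.0, 12.0], [14.0, 16.0]])
temp_375 = np.array([[12.0, 14.0], [18.0, 20.0]])
print(interpolate_field(temp_400, temp_375, 400.0, 375.0, 390.0))
```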
- Published
- 2024
- Full Text
- View/download PDF
30. Innovative cloud quantification: deep learning classification and finite-sector clustering for ground-based all-sky imaging.
- Author
-
Luo, Jingxuan, Pan, Yubing, Su, Debin, Zhong, Jinhua, Wu, Lingxiao, Zhao, Wei, Hu, Xiaoru, Qi, Zhengchao, Lu, Daren, and Wang, Yinan
- Subjects
ARTIFICIAL neural networks ,DEEP learning ,COMPUTER vision ,CLIMATE research ,METEOROLOGICAL stations - Abstract
Accurate cloud quantification is essential in climate change research. In this work, we construct an automated computer vision framework by synergistically incorporating deep neural networks and finite-sector clustering to achieve robust whole-sky image-based cloud classification, adaptive segmentation and recognition under intricate illumination dynamics. A bespoke YOLOv8 (You Only Look Once 8) architecture attains over 95 % categorical precision across four archetypal cloud varieties curated from extensive annual observations (2020) at a Tibetan highland station. Tailor-made segmentation strategies adapted to distinct cloud configurations, allied with illumination-invariant image enhancement algorithms, effectively eliminate solar interference and substantially boost quantitative performance even in illumination-adverse analysis scenarios. Compared with the traditional threshold analysis method, the cloud quantification accuracy calculated within the framework of this paper is significantly improved. Collectively, the methodological innovations provide an advanced solution to markedly escalate cloud quantification precision levels imperative for climate change research while offering a paradigm for cloud analytics transferable to various meteorological stations. [ABSTRACT FROM AUTHOR]
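The traditional threshold analysis that the framework above is compared against typically classifies sky pixels by their red/blue ratio. A minimal sketch of that baseline (the 0.7 threshold is an illustrative value, not the paper's):

```python
import numpy as np

def cloud_fraction(red, blue, threshold=0.7):
    """Traditional threshold-based cloud quantification: a pixel counts as
    cloud if its red/blue ratio exceeds a threshold (clear sky scatters
    blue strongly, so clouds have a higher ratio). Returns cloud cover
    as a fraction of the image."""
    ratio = red / np.maximum(blue, 1e-6)  # avoid division by zero
    return float((ratio > threshold).mean())

# Toy 2x2 sky patch: two blue-sky pixels, two whitish cloud pixels.
red  = np.array([[0.2, 0.3], [0.9, 0.8]])
blue = np.array([[0.8, 0.9], [0.95, 0.85]])
print(cloud_fraction(red, blue))  # 0.5
```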
- Published
- 2024
- Full Text
- View/download PDF
31. Quantitative study of storm surge risk assessment in an undeveloped coastal area of China based on deep learning and geographic information system techniques: a case study of Double Moon Bay.
- Author
-
Yu, Lichen, Qin, Hao, Huang, Shining, Wei, Wei, Jiang, Haoyu, and Mu, Lin
- Subjects
STORM surges ,GEOGRAPHIC information systems ,DEEP learning ,RISK assessment ,LAND use planning ,OCEAN waves - Abstract
Storm surges are a common natural hazard in China's southern coastal area which usually cause a great loss of human life and financial damages. With the economic development and population concentration of coastal cities, storm surges may result in more impacts and damage in the future. Therefore, it is of vital importance to conduct risk assessment to identify high-risk areas and evaluate economic losses. However, quantitative study of storm surge risk assessment in undeveloped areas of China is difficult, since there is a lack of building character and damage assessment data. Aiming at the problem of data missing in undeveloped areas of China, this paper proposes a methodology for conducting storm surge risk assessment quantitatively based on deep learning and geographic information system (GIS) techniques. Five defined storm surge inundation scenarios with different typhoon return periods are simulated by the coupled FVCOM–SWAN (Finite Volume Coastal Ocean Model–Simulating WAves Nearshore) model, the reliability of which is validated using official measurements. Building footprints of the study area are extracted through the TransUNet deep learning model and remote sensing images, while building heights are obtained through unoccupied aerial vehicle (UAV) measurements. Subsequently, economic losses are quantitatively calculated by combining the adjusted depth–damage functions and overlaying an analysis of the buildings exposed to storm surge inundation. Zoning maps of the study area are provided to illustrate the risk levels according to economic losses. The quantitative risk assessment and zoning maps can help the government to provide storm surge disaster prevention measures and to optimize land use planning and thus to reduce potential economic losses in the coastal area. [ABSTRACT FROM AUTHOR]
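The loss computation described above combines a depth-damage function with the value of each exposed building. A minimal sketch with an illustrative piecewise-linear curve (not the paper's adjusted function):

```python
def flood_loss(depth_m, building_value,
               curve=((0.0, 0.0), (0.5, 0.1), (1.0, 0.3), (2.0, 0.6), (4.0, 1.0))):
    """Economic loss from a depth-damage curve: piecewise-linear damage
    fraction vs. inundation depth, multiplied by the building value.
    The curve here is illustrative only."""
    if depth_m <= curve[0][0]:
        return 0.0
    for (d0, f0), (d1, f1) in zip(curve, curve[1:]):
        if depth_m <= d1:
            frac = f0 + (f1 - f0) * (depth_m - d0) / (d1 - d0)
            return frac * building_value
    return curve[-1][1] * building_value  # fully damaged beyond the curve

# A building worth 1,000,000 CNY inundated to 1.5 m:
# damage fraction 0.45 -> loss 450,000 CNY.
print(flood_loss(1.5, 1_000_000))
```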
- Published
- 2024
- Full Text
- View/download PDF
32. Semantic Segmentation of Building Models with Deep Learning in CityGML.
- Author
-
Rashidan, Hanis, Musliman, Ivin Amri, Abdul Rahman, Alias, Coors, Volker, and Buyuksalih, Gurcan
- Subjects
DEEP learning ,ARCHITECTURAL details ,URBAN planning ,URBAN renewal ,DATA quality - Abstract
Semantic segmentation of 3D urban environments plays an important role in urban planning, management, and analysis. This paper presents an exploration of leveraging BuildingGNN, a deep learning framework for semantic segmentation of 3D building models, and the subsequent conversion of semantic labels into CityGML, the standardized format for 3D city models. The study begins with a methodology outlining the acquisition of a labelled dataset from BuildingNet and the necessary preprocessing steps for compatibility with BuildingGNN's architecture. The training process involves deep learning techniques tailored for 3D building structures, yielding insights into model performance metrics such as Intersection over Union (IoU) for several architectural components. Evaluation of the trained model highlights its accuracy and reliability, albeit with challenges observed, particularly in segmenting certain classes like doors. Moreover, the conversion of semantic labels into CityGML format is discussed, emphasizing the importance of data quality and meticulous annotation practices. The experiment as described in the methodology shows that outputs from the BuildingGNN for semantic segmentation can be utilized for the generation of CityGML building elements with some percentage of success. This particular work reveals several challenges such as the identification of individual architectural elements based on geometry groups. We believe that the improvement of the segmentation process could be further investigated in our near future work. [ABSTRACT FROM AUTHOR]
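The per-class IoU used above to score architectural components is computed from label agreement. A minimal sketch over flat label arrays (class names are illustrative):

```python
import numpy as np

def per_class_iou(pred, truth, classes):
    """Per-class Intersection over Union for semantic segmentation labels:
    |pred==c AND truth==c| / |pred==c OR truth==c| for each class c."""
    ious = {}
    for c in classes:
        inter = np.logical_and(pred == c, truth == c).sum()
        union = np.logical_or(pred == c, truth == c).sum()
        ious[c] = float(inter) / union if union else float("nan")
    return ious

# Toy labels: 0 = wall, 1 = roof, 2 = door.
truth = np.array([0, 0, 1, 1, 2, 2])
pred  = np.array([0, 0, 1, 1, 2, 0])  # one door element mislabelled as wall
print(per_class_iou(pred, truth, classes=(0, 1, 2)))
```

A hard class like doors shows up directly as a low per-class score, as in the toy example.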
- Published
- 2024
- Full Text
- View/download PDF
33. Stereo Vision SLAM with SuperPoint and SuperGlue.
- Author
-
Yoon, Si-Won and Park, Soon-Yong
- Subjects
GRAPH neural networks ,BINOCULAR vision ,VISUAL odometry ,DEEP learning ,FEATURE extraction ,OPTICAL flow - Abstract
This paper presents a method for stereo visual odometry and mapping that integrates VINS-Fusion-based visual odometry estimation with deep learning techniques for camera pose tracking and stereo image matching. Traditional approaches in VINS-Fusion relied on classical methods for feature extraction and matching, which often resulted in inaccuracies in triangulation-based 3D position estimation. These inaccuracies could be mitigated by incorporating IMU-based position estimation, which yielded more accurate odometry estimates compared to using the stereo camera alone in three-dimensional space. Consequently, the original VINS-stereo algorithm necessitated a tightly coupled integration of IMU sensor measurements with the estimated visual odometry. To address these challenges, our work proposes replacing the traditional feature extraction method used in VINS-Fusion, the Shi-Tomasi (Good Features to Track) technique, with feature extraction via the SuperPoint deep network. This approach has demonstrated promising experimental results. Additionally, we have applied deep learning models to the matching of feature points that project the same three-dimensional point to pixel coordinates in different images. Instead of using the KLT optical flow algorithm previously employed by VINS-Fusion, our proposed method utilizes SuperGlue, a deep graph neural network for graph matching, to improve image tracking and stereo image matching performance. The performance of the proposed algorithm is evaluated using the publicly available EuRoC dataset, providing a comparison with existing algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
34. Advances and Prospects of Deep Learning for Medium-Range Extreme Weather Forecasting.
- Author
-
Olivetti, Leonardo and Messori, Gabriele
- Subjects
DEEP learning ,LONG-range weather forecasting ,WEATHER forecasting ,EXTREME weather - Abstract
In recent years, deep learning models have rapidly emerged as a standalone alternative to physics-based numerical models for medium-range weather forecasting. Several independent research groups claim to have developed deep learning weather forecasts which outperform those from state-of-the-art physics-based models, and operational implementation of data-driven forecasts appears to be drawing near. Yet, questions remain about the capabilities of deep learning models to provide robust forecasts of extreme weather. This paper provides an overview of recent developments in the field of deep learning weather forecasts, and scrutinises the challenges that extreme weather events pose to leading deep learning models. Lastly, it argues for the need to tailor data-driven models to forecast extreme events, and proposes a foundational workflow to develop such models. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
35. NorSand4AI: A Comprehensive Triaxial Test Simulation Database for NorSand Constitutive Model Materials.
- Author
-
Ozelim, Luan Carlos de Sena Monteiro, Casagrande, Michéle Dal Toé, and Cavalcante, André Luís Brasil
- Subjects
DATABASES ,DEEP learning ,SOIL science ,SOIL classification ,SCIENTIFIC discoveries ,SOIL testing - Abstract
To learn, humans observe and experience the world, collect data, and establish patterns through repetition. In scientific discovery, these patterns and relationships are expressed as laws and equations, data as properties and variables, and observations as events. Data-driven techniques aim to provide an impartial approach to learning using raw data from actual or simulated observations. In soil science, parametric models known as constitutive models are used to represent the behavior of natural and artificial materials. Creating data-driven constitutive models using deep learning techniques requires large and consistent datasets, which are challenging to acquire through experiments. Synthetic data can be generated using a theoretical function, but there is a lack of literature on high-volume and robust datasets of this kind. Digital soil models can be utilized to conduct numerical simulations that produce synthetic results of triaxial tests, which are regarded as the preferred tests for assessing soil's constitutive behavior. Due to its limitations for modeling real sands, the Modified Cam Clay model has been replaced by the NorSand model in some situations where sand-like materials need to be modelled. Therefore, for a material following the NorSand model, the present paper presents a first-of-its-kind database that addresses the size and complexity issues of creating synthetic datasets for nonlinear constitutive modeling of soils by simulating both drained and undrained triaxial tests of 2000 soil types, each subjected to 40 initial test configurations, resulting in a total of 160000 triaxial test results. Each simulation dataset comprises a 4000 × 10 matrix that can be used for general multivariate forecasting benchmarks, in addition to direct geotechnical and soil science applications. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
36. CRITICAL REFLECTION ON QUANTITATIVE ASSESSMENT OF IMAGE FUSION QUALITY.
- Author
-
Xu, S. and Ehlers, M.
- Subjects
IMAGE fusion ,IMAGE quality analysis ,CRITICAL thinking ,MULTISENSOR data fusion ,IMAGE processing ,IMAGE sensors ,DEEP learning - Abstract
Image fusion techniques have extended from multi-sensor fusion and multi-modal fusion to multi-focus fusion. More and more advanced techniques, such as deep learning, have been integrated into the development of image fusion algorithms. However, fusion quality assessment, an important aspect, has received less attention. This paper reflects on the commonly used indices for quantitative assessment and investigates how well they represent fusion quality with regard to spectral preservation and spatial improvement. We found that image dissimilarities are unavoidable due to the differing spectral coverage of image sensors. Image fusion should integrate these dissimilarities when they represent spatial improvement. Such integration will naturally change the pixel values. However, as the quality indices for the assessment of spectral preservation measure image dissimilarities, the integration of spatial information will lead to a low fusion quality assessment. For the evaluation of spatial improvement, the quality indices only work if spatial details have been lost; in the case of spatial detail gain, these indices do not reflect it as spatial improvement. Moreover, this paper draws attention to the image processing procedures involved in image fusion, including image geo-registration, image clipping and image resampling, which change image statistics and thereby influence the quality assessment when statistical indices are used. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
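The spectral-preservation indices the paper critiques are typically statistical similarity measures. A minimal sketch of one such index, a Pearson correlation coefficient between an original band and its fused counterpart, illustrating the paper's point that injecting spatial detail lowers the score even when the detail is legitimate (toy arrays, not real imagery; the paper's exact index set is not reproduced here):

```python
import numpy as np

def spectral_cc(original: np.ndarray, fused: np.ndarray) -> float:
    """Pearson correlation coefficient between an original band and a
    fused band -- one representative statistical similarity index."""
    o = original.ravel().astype(float)
    f = fused.ravel().astype(float)
    o -= o.mean()
    f -= f.mean()
    return float((o @ f) / (np.linalg.norm(o) * np.linalg.norm(f)))

band = np.arange(16, dtype=float).reshape(4, 4)
identical = spectral_cc(band, band)                      # unchanged band
with_detail = spectral_cc(band, band + np.eye(4) * 5.0)  # injected detail
```

Even though the added diagonal "detail" might represent genuine spatial improvement, `with_detail` drops below the perfect score of `identical`, which is exactly the assessment bias the paper discusses.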
37. BENCHMARKING THE EXTRACTION OF 3D GEOMETRY FROM UAV IMAGES WITH DEEP LEARNING METHODS.
- Author
-
Nex, F., Zhang, N., Remondino, F., Farella, E. M., Qin, R., and Zhang, C.
- Subjects
DEEP learning ,STEREO image ,MONOCULARS ,GEOMETRY ,LIDAR - Abstract
3D reconstruction from single and multi-view stereo images is still an open research topic, despite the high number of solutions proposed in the last decades. The surge of deep learning methods has stimulated the development of new methods using monocular (MDE, Monocular Depth Estimation), stereoscopic and Multi-View Stereo (MVS) 3D reconstruction, showing promising results, often comparable to or even better than traditional methods. The more recent development of NeRF (Neural Radiance Fields) has further triggered interest in this kind of solution. Most of the proposed approaches, however, focus on terrestrial applications (e.g., autonomous driving or 3D reconstruction of small artefacts), while airborne and UAV acquisitions are often overlooked. The recent introduction of new datasets, such as UseGeo, has therefore given the opportunity to assess how state-of-the-art MDE, MVS and NeRF 3D reconstruction algorithms perform using airborne UAV images, allowing their comparison with LiDAR ground truth. This paper presents the results achieved by two MDE, two MVS and two NeRF approaches leveraging deep learning, trained and tested using the UseGeo dataset. This work allows comparison with a ground truth, showing the current state of the art of these solutions and providing useful indications for their future development and improvement. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
38. AN INFORMAL ROAD DETECTION NEURAL NETWORK FOR SOCIETAL IMPACT IN DEVELOPING COUNTRIES.
- Author
-
Fabris-Rotelli, I., Wannenburg, A., Maribe, G., Thiede, R., Vogel, M., Coetzee, M., Sethaelo, K., Selahle, E., Debba, P., and Rautenbach, V.
- Subjects
DEVELOPING countries ,DEEP learning ,REMOTE sensing ,DATA quality ,SUSTAINABLE development - Abstract
Roads found in informal settlements arise out of convenience, and are often not recorded or maintained by authorities. This complicates service delivery, sustainable development and crisis mitigation, including the management and tracking of COVID-19. We, therefore, aim to extract informal roads from remote sensing images. Existing techniques aimed at the extraction of formal roads are not suitable for the problem due to the complex physical and spectral properties of informal roads. The only existing approaches for informal roads, namely (Nobrega et al., 2006, Thiede et al., 2020), do not consider neural networks as a solution. Neural networks show promise in overcoming these complexities. However, they require a large amount of data to learn, which is currently not available due to the expensive and time-consuming nature of collecting such data. This paper implements a neural network to extract informal roads from a data set digitised by this research group. Data quality is assessed by calculating validity, completeness, homogeneity and the V-measure, a measure of consistency, in order to evaluate the overall usability of the dataset for neural network informal road detection. We implement the GANs-UNet model that obtained the highest F1-score in a 2020 review paper (Abdollahi et al., 2020) on state-of-the-art deep learning models used to extract formal roads. The results indicate that the model is able to extract informal roads successfully in the presence of appropriate training data. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
39. PIXEL-RESOLUTION DTM GENERATION FOR THE LUNAR SURFACE BASED ON A COMBINED DEEP LEARNING AND SHAPE-FROM-SHADING (SFS) APPROACH.
- Author
-
Chen, H., Hu, X., and Oberst, J.
- Subjects
DEEP learning ,LUNAR surface ,LUNAR exploration ,CONVOLUTIONAL neural networks ,DIGITAL elevation models ,SPACE flight to the moon - Abstract
High-resolution Digital Terrain Models (DTMs) of the lunar surface can provide crucial spatial information for lunar exploration missions. In this paper, we propose a method to generate high-quality DTMs based on a synthesis of deep learning and Shape from Shading (SFS), with a Lunar Reconnaissance Orbiter Narrow Angle Camera (LROC NAC) image as well as a coarse-resolution DTM as input. Specifically, we use a Convolutional Neural Network (CNN)-based deep learning architecture to predict initial pixel-resolution DTMs. Then, we use SFS to improve the details of the DTMs. The CNN model is trained on a dataset of 30,000 samples, formed from stereo-photogrammetry-derived DTMs and orthoimages using LROC NAC images as well as the Selenological and Engineering Explorer and LRO Elevation Model (SLDEM). We take the Chang'E-3 landing site as an example, and use a 1.6 m resolution LROC NAC image and a 5 m resolution stereo-photogrammetry-derived DTM as input to test the proposed method. We evaluate our DTMs against those from stereo-photogrammetry and deep learning. The results show that the proposed method can generate 1.6 m resolution high-quality DTMs, which clearly improve the visibility of details over the initial DTM generated by the deep learning method. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
40. A COMPARATIVE STUDY OF SEVERAL SLFN-BASED CLASSIFICATION ALGORITHMS FOR URBAN AND RURAL LAND USE.
- Author
-
Lin, Y., Xie, G., Zhang, T., Yu, J., Zhang, H., and Cai, J.
- Subjects
DEEP learning ,URBAN land use ,FEEDFORWARD neural networks ,RURAL land use ,ZONING ,CLASSIFICATION algorithms ,MACHINE learning ,SUSTAINABLE urban development - Abstract
In the study of urban sustainable development, accurate classification of land use has become an important basis for monitoring urban dynamic changes. Hence it is necessary to develop an appropriate recognition model for urban-rural land use. Although deep learning algorithms have become a research hotspot in image classification tasks in recent years and many good results have been achieved, other machine learning algorithms are not going away. Comparing deep learning with traditional machine learning, each has advantages and disadvantages in data dependence, hardware dependence, feature processing, problem-solving approach, execution time, interpretability, etc. Especially in the classification of remote sensing images, the continued research and development of traditional machine learning algorithms is still of great significance. In this paper, the performances of several SLFN-based classification algorithms were studied and compared, including ELM, RBF K-ELM, mixed K-ELM, A-ELM and SVM. The Extreme Learning Machine (ELM) is an algorithm for single-hidden-layer feedforward neural networks (SLFNs). It has a simple structure, is fast and is easy to train. In some applications, however, standard ELM is prone to overfitting, and its performance is seriously affected when outliers exist. In order to explore the performance of ELM and its improved algorithms for urban-rural land use classification, comparative experiments between three improved ELM algorithms (RBF K-ELM, mixed K-ELM and A-ELM), ELM and SVM were performed with image data from several study areas, and the classification accuracy and efficiency were analysed. The results show that the three improved ELM algorithms perform better than standard ELM and SVM in both overall accuracy and Kappa coefficient. However, it is worth noting that the computational efficiency of RBF K-ELM and mixed K-ELM decreases greatly with larger images; their time cost is much higher than that of the other algorithms. Compared with the other algorithms, A-ELM has the advantages of higher overall accuracy and less classification time. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
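The standard ELM used as the baseline above can be sketched in a few lines: random, fixed input weights, a single tanh hidden layer, and output weights solved analytically by least squares. This is an illustrative sketch on toy data; the paper's RBF K-ELM, mixed K-ELM and A-ELM variants differ in kernel and training details:

```python
import numpy as np

def elm_train(X, y, n_hidden=50, seed=0):
    """Train a minimal ELM: random hidden layer, analytic output weights."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))  # fixed random weights
    b = rng.standard_normal(n_hidden)                # fixed random biases
    H = np.tanh(X @ W + b)                           # hidden-layer features
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)     # least-squares solve
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy two-class problem: label points by the sign of x0 + x1.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)
W, b, beta = elm_train(X, y)
acc = ((elm_predict(X, W, b, beta) > 0.5) == (y > 0.5)).mean()
```

Because only `beta` is learned, and in closed form, training is very fast, which is the speed advantage the abstract refers to; the flip side is the sensitivity to outliers that the improved variants address.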
41. AUTOMATIC TRAINING DATA GENERATION IN DEEP LEARNING-AIDED SEMANTIC SEGMENTATION OF HERITAGE BUILDINGS.
- Author
-
Murtiyoso, A., Matrone, F., Martini, M., Lingua, A., Grussenmeyer, P., and Pierdicca, R.
- Subjects
DEEP learning ,MACHINE learning ,POINT cloud ,HISTORIC buildings ,INTELLIGENT buildings ,GEOMETRIC approach ,COLUMNS - Abstract
In the geomatics domain the use of deep learning, a subset of machine learning, is becoming more and more widespread. In this context, the 3D semantic segmentation of heritage point clouds presents an interesting and promising approach for modelling automation, in light of the heterogeneous nature of historical building styles and features. However, this heterogeneity also presents an obstacle in terms of generating the training data for use in deep learning, hitherto performed largely manually. The current generally low availability of labelled data also presents a motivation to aid the process of training data generation. In this paper, we propose the use of approaches based on geometric rules to automate this task to a certain degree. One object class will be discussed in this paper, namely the pillars class. Results show that the approach managed to extract pillars with satisfactory quality (98.5% of pillars correctly detected with the proposed algorithm). Tests were also performed using the outputs in a deep learning segmentation setting, with a favourable outcome in terms of reducing the overall labelling time (−66.5%). Certain particularities were nevertheless observed, which also influence the result of the deep learning segmentation. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
42. LEARNING TO SIEVE: PREDICTION OF GRADING CURVES FROM IMAGES OF CONCRETE AGGREGATE.
- Author
-
Coenen, M., Beyer, D., Heipke, C., and Haist, M.
- Subjects
DEEP learning ,CONCRETE ,CONSTRUCTION materials ,FEATURE extraction ,PARTICLE size distribution ,SIEVES - Abstract
A large component of the building material concrete consists of aggregate with varying particle sizes between 0.125 and 32 mm. Its actual size distribution significantly affects the quality characteristics of the final concrete in both the fresh and hardened states. The usually unknown variations in the size distribution of the aggregate particles, which can be large especially when using recycled aggregate materials, are typically compensated for by an increased usage of cement, which, however, has severe negative impacts on the economic and ecological aspects of concrete production. In order to allow precise control of the target properties of the concrete, unknown variations in the size distribution have to be quantified to enable a proper adaptation of the concrete's mixture design in real time. To this end, this paper proposes a deep learning-based method for the determination of concrete aggregate grading curves. In this context, we propose a network architecture applying multi-scale feature extraction modules in order to handle the strongly diverse object sizes of the particles. Furthermore, we propose and publish a novel dataset of concrete aggregate used for the quantitative evaluation of our method. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
43. SAR deep learning sea ice retrieval trained with airborne laser scanner measurements from the MOSAiC expedition.
- Author
-
Kortum, Karl, Singha, Suman, Spreen, Gunnar, Hutter, Nils, Jutila, Arttu, and Haas, Christian
- Subjects
OPTICAL scanners ,AIRBORNE lasers ,SEA ice ,CONVOLUTIONAL neural networks ,LASER measurement ,DEEP learning - Abstract
Automated sea ice charting from synthetic aperture radar (SAR) has been researched for more than a decade, and we are still not close to unlocking the full potential of automated solutions in terms of resolution and accuracy. The central complications arise from ground truth data not being readily available in the polar regions. In this paper, we build a data set from 20 near-coincident X-band SAR acquisitions and as many airborne laser scanner (ALS) measurements from the Multidisciplinary drifting Observatory for the Study of Arctic Climate (MOSAiC), between October and May. This data set is then used to assess the accuracy and robustness of five machine-learning-based approaches by deriving classes from the freeboard, surface roughness (standard deviation at 0.5 m correlation length) and reflectance. It is shown that there is only a weak correlation between the radar backscatter and the sea ice topography. Accuracies between 44 % and 66 % and robustness between 71 % and 83 % give a realistic insight into the performance of modern convolutional neural network architectures across a range of ice conditions over 8 months. It also marks the first time algorithms have been trained entirely with labels from coincident measurements, allowing for a probabilistic class retrieval. The results show that segmentation models able to learn from the class distribution perform significantly better than pixel-wise classification approaches, by nearly 20 % accuracy on average. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
44. The Legacy of Sycamore Gap: The Potential of Photogrammetric AI for Reverse Engineering Lost Heritage with Crowdsourced Data.
- Author
-
Morelli, Luca, Mazzacca, Gabriele, Trybała, Pawel, Gaspari, Federica, Ioli, Francesco, Ma, Zhenyu, Remondino, Fabio, Challis, Keith, Poad, Andrew, Turner, Alex, and Mills, Jon P.
- Subjects
HADRIAN'S Wall (England) ,REVERSE engineering ,OPTICAL scanners ,DEEP learning ,SYCAMORES ,ARTIFICIAL intelligence ,IMAGE registration ,POINT cloud - Abstract
The orientation of crowdsourced and multi-temporal image datasets presents a challenging task for traditional photogrammetry. Indeed, traditional image matching approaches often struggle to find accurate and reliable tie points in images that appear significantly different from one another. In this paper, in order to preserve the memory of the Sycamore Gap tree, a symbol of Hadrian's Wall that was felled in an act of vandalism in September 2023, deep-learning-based features trained specifically on challenging image datasets were employed to overcome limitations of traditional matching approaches. We demonstrate how unordered crowdsourced images and UAV videos can be oriented and used for 3D reconstruction purposes, together with a recently acquired terrestrial laser scanner point cloud for scaling and referencing. This allows the memory of the Sycamore Gap tree to live on and exhibits the potential of photogrammetric AI (Artificial Intelligence) for reverse engineering lost heritage. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
45. ESTATE: A Large Dataset of Under-Represented Urban Objects for 3D Point Cloud Classification.
- Author
-
Bayrak, Onur Can, Ma, Zhenyu, Farella, Elisa Mariarosaria, Remondino, Fabio, and Uzar, Melis
- Subjects
POINT cloud ,MACHINE learning ,DEEP learning ,CLASSIFICATION ,THREE-dimensional imaging ,CITIES & towns - Abstract
Cityscapes contain a variety of objects, each with a particular role in urban administration and development. With the rapid growth and implementation of 3D imaging technology, urban areas are increasingly surveyed with high-resolution point clouds. This technical advancement extensively improves our ability to capture and analyse urban environments and their small objects. Deep learning algorithms for point cloud data have shown considerable capacity in 3D object classification but still face problems with generally under-represented objects (such as light poles or chimneys). This paper introduces the ESTATE dataset (https://github.com/3DOM-FBK/ESTATE), which combines available datasets of various sensors, densities, regions, and object types. It includes 13 classes featuring intensity and/or colour attributes. Tests using ESTATE demonstrate that the dataset improves the classification performance of deep learning techniques and could be a game-changer for advancing the 3D classification of urban objects. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
46. Semantic Segmentation Uncertainty Assessment of Different U-net Architectures for Extracting Building Footprints.
- Author
-
Haghighi Gashti, Ehsan, Delavar, Mahmoud Reza, Guan, Haiyan, and Li, Jonathan
- Subjects
URBAN planning ,DEEP learning ,CITIES & towns ,MACHINE learning ,ENVIRONMENTAL monitoring - Abstract
Automatic extraction of building footprints from aerial and space imagery has found ever-increasing importance in urban planning, disaster management, and environmental monitoring. However, achieving accurate building footprint extraction poses significant challenges due to diverse building characteristics and their similarity to background elements. While conventional methods for building footprint extraction have mainly relied on image processing techniques, recent advancements in deep learning, particularly semantic segmentation algorithms like U-Net, have shown promise in addressing these challenges. This study explores different depths of the U-Net model for building footprint extraction, aiming to identify the optimum architecture while investigating the semantic uncertainty of the extraction. Utilizing aerial imagery from cities including Berlin, Paris, Chicago, and Zurich, collected from Google Maps and OpenStreetMap (OSM) data, five U-Net models with varying depths have been compared. In addition, the impact of dataset size and learning rate on model performance has been investigated. Results confirmed that the U-Net-32-1024 model achieves the highest intersection over union (IoU), Accuracy, and F1-score. Moreover, increasing the training dataset size leads to significant improvements in model performance, with IoU, Accuracy and F1-score reaching 73.73%, 88.65% and 88.53%, respectively. However, challenges remain in accurately delineating buildings in dense urban areas. Nonetheless, our findings demonstrated the effectiveness of U-Net models in building footprint extraction. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
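The IoU and F1 scores reported for the U-Net variants above are straightforward to compute from binary masks. A minimal sketch on toy 2 × 3 masks (the evaluation pipeline and mask format of the study are assumptions here):

```python
import numpy as np

def iou_f1(pred: np.ndarray, truth: np.ndarray):
    """Intersection-over-Union and F1 for binary building masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    iou = inter / union if union else 1.0
    tp = inter
    fp = np.logical_and(pred, ~truth).sum()   # predicted building, is not
    fn = np.logical_and(~pred, truth).sum()   # missed building pixels
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 1.0
    return float(iou), float(f1)

pred = np.array([[1, 1, 0], [0, 1, 0]])
truth = np.array([[1, 0, 0], [0, 1, 1]])
iou, f1 = iou_f1(pred, truth)  # inter=2, union=4 -> IoU 0.5; F1 = 4/6
```

Note that F1 (the Dice score for binary masks) is always at least as large as IoU, which is why the study's F1 values sit above its IoU values.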
47. Advances and prospects of deep learning for medium-range extreme weather forecasting.
- Author
-
Olivetti, Leonardo and Messori, Gabriele
- Subjects
DEEP learning ,LONG-range weather forecasting ,WEATHER forecasting ,EXTREME weather - Abstract
In recent years, deep learning models have rapidly emerged as a stand-alone alternative to physics-based numerical models for medium-range weather forecasting. Several independent research groups claim to have developed deep learning weather forecasts that outperform those from state-of-the-art physics-based models, and operational implementation of data-driven forecasts appears to be drawing near. However, questions remain about the capabilities of deep learning models with respect to providing robust forecasts of extreme weather. This paper provides an overview of recent developments in the field of deep learning weather forecasts and scrutinises the challenges that extreme weather events pose to leading deep learning models. Lastly, it argues for the need to tailor data-driven models to forecast extreme events and proposes a foundational workflow to develop such models. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
48. An Ensemble Learning Framework for Anomaly Detection of Important Geographical Entities.
- Author
-
Li, Haolin, Tian, Jiaojiao, Wang, Shan, Song, Ping, and She, Yi
- Subjects
INTRUSION detection systems (Computer security) ,DEEP learning ,REMOTE sensing - Abstract
Due to the complex landforms and the limited resolution of remote sensing imagery, it is difficult to avoid incorrectly capturing geographical entities such as buildings. Therefore, anomaly detection of important geographical entities is of great significance to ensure the authenticity and accuracy of geographical entity data. In this paper, we propose an ensemble learning framework for anomaly detection of geographical entities by aggregating the predicted labels generated by multiple deep learning models. In detail, we explore multiple change detection and semantic segmentation models and fully utilize the advantages of various deep learning neural network architectures. The proposed anomaly detection strategy for buildings has been evaluated on two benchmark datasets, the WHU and LEVIR building change detection datasets; the experimental results show that the proposed method achieves more robust and better performance than a single change detection model in terms of both quantitative and visual performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
49. Deep learning applied to CO2 power plant emissions quantification using simulated satellite images.
- Author
-
Dumont Le Brazidec, Joffrey, Vanderbecken, Pierre, Farchi, Alban, Broquet, Grégoire, Kuhlmann, Gerrit, and Bocquet, Marc
- Subjects
DEEP learning ,REMOTE-sensing images ,POWER plants ,CARBON dioxide ,CONVOLUTIONAL neural networks ,GREENHOUSE gases ,AIR pollutants - Abstract
The quantification of emissions of greenhouse gases and air pollutants through the inversion of plumes in satellite images remains a complex problem that current methods can only assess with significant uncertainties. The anticipated launch of the CO2M (Copernicus Anthropogenic Carbon Dioxide Monitoring) satellite constellation in 2026 is expected to provide high-resolution images of CO2 (carbon dioxide) column-averaged mole fractions (XCO2), opening up new possibilities. However, the inversion of future CO2 plumes from CO2M will encounter various obstacles. A challenge is the low CO2 plume signal-to-noise ratio due to the variability in the background and instrumental errors in satellite measurements. Moreover, uncertainties in the transport and dispersion processes further complicate the inversion task. To address these challenges, deep learning techniques, such as neural networks, offer promising solutions for retrieving emissions from plumes in XCO2 images. Deep learning models can be trained to identify emissions from plume dynamics simulated using a transport model. It then becomes possible to extract relevant information from new plumes and predict their emissions. In this paper, we develop a strategy employing convolutional neural networks (CNNs) to estimate the emission fluxes from a plume in a pseudo-XCO2 image. Our dataset used to train and test such methods includes pseudo-images based on simulations of hourly XCO2, NO2 (nitrogen dioxide), and wind fields near various power plants in eastern Germany, tracing plumes from anthropogenic and biogenic sources. CNN models are trained to predict emissions from three power plants that exhibit diverse characteristics. The power plants used to assess the deep learning model's performance are not used to train the model. 
We find that the CNN model outperforms state-of-the-art plume inversion approaches, achieving highly accurate results with an absolute error about half of that of the cross-sectional flux method and an absolute relative error of ∼ 20 % when only the XCO2 and wind fields are used as inputs. Furthermore, we show that our estimations are only slightly affected by the absence of NO2 fields or a detection mechanism as additional information. Finally, interpretability techniques applied to our models confirm that the CNN automatically learns to identify the XCO2 plume and to assess emissions from the plume concentrations. These promising results suggest a high potential of CNNs in estimating local CO2 emissions from satellite images. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
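The cross-sectional flux method that serves as the baseline above integrates the XCO2 enhancement along a transect perpendicular to the plume and multiplies by the wind speed. A heavily simplified sketch, with units and mass-conversion factors deliberately omitted; the values and transect below are toy inputs, not the paper's data:

```python
import numpy as np

def cross_sectional_flux(xco2_line, background, wind_speed, pixel_size):
    """Emission-rate proxy from one transect across a plume: integrate the
    enhancement above background, then multiply by the wind speed."""
    enhancement = np.asarray(xco2_line, dtype=float) - background
    enhancement[enhancement < 0] = 0.0             # keep only plume signal
    line_density = enhancement.sum() * pixel_size  # integral across transect
    return line_density * wind_speed

transect = [400.0, 401.0, 403.0, 402.0, 400.0]  # XCO2 (ppm) along one row
flux = cross_sectional_flux(transect, background=400.0,
                            wind_speed=5.0, pixel_size=2000.0)
```

A full implementation would average over many transects and convert the column enhancement to a mass flux; the CNN approach in the paper instead learns the mapping from the whole image directly.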
50. AUTOMATIC POINT CLOUD NOISE MASKING IN CLOSE RANGE PHOTOGRAMMETRY FOR BUILDINGS USING AI-BASED SEMANTIC LABELLING.
- Author
-
Murtiyoso, A. and Grussenmeyer, P.
- Subjects
AUDITORY masking ,DEEP learning ,POINT cloud ,ARTIFICIAL intelligence ,PHOTOGRAMMETRY ,IMAGE registration ,BUILDING repair - Abstract
The use of AI in semantic segmentation has grown significantly in recent years, aided by developments in computing power and the availability of annotated images for training data. However, in the context of close-range photogrammetry, although working with 2D images, AI is still used mostly for 3D point cloud segmentation purposes. In this paper, we propose a simple method to apply such techniques in close-range photogrammetry by benefitting from deep learning-based semantic segmentation. Specifically, AI was used to detect unwanted objects in a scene involving the 3D reconstruction of a historical building façade. For these purposes, classes such as sky, trees, and electricity poles were considered noise. Masks were then created from the results, which would then constrain the dense image matching process to only the wanted classes. In this regard, the resulting dense point cloud essentially projected the 2D semantic labels into 3D space, thus excluding noise and unwanted object classes from the 3D scene. Our results were compared to manual image masking and achieved comparable quality while requiring only a fraction of the processing time when using a pre-trained DL network for the task. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
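The masking step above reduces to a per-pixel boolean filter over a semantic label map. A minimal sketch with assumed class ids; the paper's actual classes, network, and matching software are not reproduced here:

```python
import numpy as np

# Hypothetical label map from a semantic-segmentation network: one integer
# class id per pixel. Class names/ids below are illustrative assumptions.
SKY, TREE, POLE, BUILDING = 0, 1, 2, 3
NOISE_CLASSES = [SKY, TREE, POLE]

labels = np.array([[0, 0, 3],
                   [1, 3, 3],
                   [3, 3, 2]])

# Binary mask: True where the pixel belongs to a wanted class. Dense image
# matching would then be restricted to these pixels, so noise classes never
# enter the reconstructed point cloud.
keep = ~np.isin(labels, NOISE_CLASSES)
```

Most photogrammetric packages accept such masks per image, so the 2D labels effectively propagate into the 3D scene, as the abstract describes.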