132 results
Search Results
2. Vision-Based Multi-Stages Lane Detection Algorithm.
- Author
- Faizi, Fayez Saeed and Al-sulaifanie, Ahmed Khorsheed
- Subjects
- CONVOLUTIONAL neural networks, DRIVERLESS cars, AUTONOMOUS vehicles, ALGORITHMS
- Abstract
Lane detection is an essential task for autonomous vehicles. Deep learning-based lane detection methods are leading development in this sector. This paper proposes an algorithm named Deep Learning-based Lane Detection (DLbLD), a Convolutional Neural Network (CNN)-based lane detection algorithm. The presented paradigm deploys a CNN to detect lane-line features in image blocks, predict a point on the lane-line segment, and project all the detected points for each frame into one-dimensional form before applying K-means clustering to assign points to the related lane lines. Extensive tests on different benchmarks were done to evaluate the performance of the proposed algorithm. The results demonstrate that the introduced DLbLD scheme achieves state-of-the-art performance, with F1 scores of 97.19 and 79.02 recorded for the TuSimple and CULane benchmarks, respectively. Overall, the results indicate the high accuracy of the proposed algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
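The clustering stage described in the abstract above can be sketched in a few lines. This is a hypothetical illustration (toy point coordinates, plain 1-D K-means, two lanes assumed), not the paper's DLbLD implementation:

```python
# Hypothetical sketch: detected lane points are projected to one dimension
# and grouped into lanes with K-means. All coordinates are invented.

def kmeans_1d(values, k, iters=20):
    """Cluster 1-D values into k groups; returns (centroids, labels)."""
    centroids = sorted(values)[:: max(1, len(values) // k)][:k]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c]))
                  for v in values]
        # Update step: mean of each cluster (keep old centroid if empty).
        for c in range(k):
            members = [v for v, l in zip(values, labels) if l == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return centroids, labels

# Project 2-D lane points onto the x-axis (a simple stand-in for the
# paper's 1-D projection) and assign each to one of two lane lines.
points = [(105, 400), (110, 380), (98, 360), (305, 400), (298, 380), (310, 360)]
xs = [x for x, _ in points]
centroids, labels = kmeans_1d(xs, k=2)
```

In practice the number of lanes (k) would itself be estimated per frame rather than fixed.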
3. Enhancing automated vehicle identification by integrating YOLO v8 and OCR techniques for high-precision license plate detection and recognition.
- Author
- Moussaoui, Hanae, Akkad, Nabil El, Benslimane, Mohamed, El-Shafai, Walid, Baihan, Abdullah, Hewage, Chaminda, and Rathore, Rajkumar Singh
- Subjects
- AUTOMOBILE license plates, PATTERN recognition systems, AUTONOMOUS vehicles, COMPUTER vision, TEXT recognition, DEEP learning, IDENTIFICATION
- Abstract
Vehicle identification systems are vital components that enable many aspects of contemporary life, such as safety, trade, transit, and law enforcement. They improve community and individual well-being by increasing vehicle management, security, and transparency. These tasks entail locating and extracting license plates from images or video frames using computer vision and machine learning techniques, followed by recognizing the letters or digits on the plates. This paper proposes a new license plate detection and recognition method based on the deep learning YOLO v8 model, image processing techniques, and OCR for text recognition. The first step was dataset creation, gathering 270 images from the internet. The dataset was then annotated using CVAT (Computer Vision Annotation Tool), an open-source platform designed to simplify the annotation and labeling of images and videos for computer vision tasks. Next, the newly released YOLO version, YOLO v8, was employed to detect the number-plate area in the input image. After extracting the plate, the k-means clustering algorithm, thresholding techniques, and the opening morphological operation were used to enhance the image and make the characters on the license plate clearer before applying OCR. OCR was then used to extract the characters. Finally, a text file containing only the characters reflecting the vehicle's country is generated. To evaluate the proposed approach, several metrics were employed, namely precision, recall, F1-score, and CLA, and a comparison of the proposed method with existing techniques in the literature is given. The suggested method obtained convincing results in both detection and recognition, achieving an accuracy of 99% in detection and 98% in character recognition. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
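The pre-OCR enhancement step mentioned in this abstract (thresholding followed by a morphological opening) can be illustrated with a toy sketch. The tiny image, the threshold value, and the 3x3 structuring element are assumptions; the paper's full pipeline also involves YOLO v8 and k-means:

```python
# Global thresholding followed by an opening (erosion then dilation)
# removes speckle noise before character recognition.

def threshold(img, t):
    return [[1 if px > t else 0 for px in row] for row in img]

def erode(img):
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = min(img[y + dy][x + dx]
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1))
    return out

def dilate(img):
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = max(img[y + dy][x + dx]
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                            if 0 <= y + dy < h and 0 <= x + dx < w)
    return out

def opening(img):
    return dilate(erode(img))

# A bright 3x3 "character stroke" plus one isolated noisy pixel.
gray = [
    [10, 10, 10, 10, 10, 10],
    [10, 200, 200, 200, 10, 10],
    [10, 200, 200, 200, 10, 10],
    [10, 200, 200, 200, 10, 250],   # lone bright pixel: noise
    [10, 10, 10, 10, 10, 10],
]
binary = threshold(gray, 128)
cleaned = opening(binary)
```

The opening keeps the solid stroke while the isolated bright pixel is erased.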
4. A Review of Deep Learning Advancements in Road Analysis for Autonomous Driving.
- Author
- Botezatu, Adrian-Paul, Burlacu, Adrian, and Orhei, Ciprian
- Subjects
- CONVOLUTIONAL neural networks, AUTONOMOUS vehicles, DEEP learning, PAVEMENTS
- Abstract
The rapid advancement of autonomous vehicle technology has brought into focus the critical need for enhanced road safety systems, particularly in the areas of road damage detection and surface classification. This paper explores these two essential components, highlighting their importance in autonomous driving. In the domain of road damage detection, this study examines a range of deep learning methods, particularly one-stage and two-stage detectors. These methodologies, including notable ones like YOLO and SSD for one-stage detection and Faster R-CNN for two-stage detection, are critically analyzed for their efficacy in identifying various road damages under diverse conditions. The review provides insights into their comparative advantages, balancing real-time processing against accuracy in damage localization. For road surface classification, the paper investigates classification techniques based on both environmental conditions and road material composition. It highlights the role of different convolutional neural network architectures and innovations at the neural level in enhancing classification accuracy under varying road and weather conditions. The main contribution of this work is a comprehensive overview of the current state of the art, showcasing significant strides in utilizing deep learning for road analysis in autonomous vehicle systems. The study concludes by underscoring the importance of continued research in these areas to further refine and improve the safety and efficiency of autonomous driving. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
5. Enhancing Autonomous Vehicle Perception in Adverse Weather: A Multi Objectives Model for Integrated Weather Classification and Object Detection.
- Author
- Aloufi, Nasser, Alnori, Abdulaziz, and Basuhail, Abdullah
- Subjects
- OBJECT recognition (Computer vision), GENERATIVE adversarial networks, SEVERE storms, DEEP learning, WEATHER
- Abstract
Robust object detection and weather classification are essential for the safe operation of autonomous vehicles (AVs) in adverse weather conditions. While existing research often treats these tasks separately, this paper proposes a novel multi-objective model that treats weather classification and object detection as a single problem using only the AV camera sensing system. Our model offers enhanced efficiency and potential performance gains by integrating image quality assessment, a Super-Resolution Generative Adversarial Network (SRGAN), and a modified version of You Only Look Once (YOLO) version 5. Additionally, leveraging the challenging Detection in Adverse Weather Nature (DAWN) dataset, which covers four types of severe weather conditions including the often-overlooked sandy weather, we applied several augmentation techniques, significantly expanding the dataset from 1027 images to 2046 images. Furthermore, we optimize the YOLO architecture for robust detection of six object classes (car, cyclist, pedestrian, motorcycle, bus, truck) across adverse weather scenarios. Comprehensive experiments demonstrate the effectiveness of our approach, achieving a mean average precision (mAP) of 74.6% and underscoring the potential of this multi-objective model to significantly advance the perception capabilities of autonomous vehicles' cameras in challenging environments. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
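A figure like the 74.6% mAP quoted above is produced by averaging per-class average precision over ranked detections. A minimal sketch with invented detections, using standard all-point interpolation (the paper's exact evaluation protocol may differ):

```python
# Per class: rank detections by confidence, accumulate precision/recall,
# integrate the precision envelope, then average the per-class APs.

def average_precision(is_tp, num_gt):
    """is_tp: True/False per detection, ranked by confidence descending."""
    tp = fp = 0
    points = []
    for hit in is_tp:
        tp += int(hit)
        fp += int(not hit)
        points.append((tp / num_gt, tp / (tp + fp)))  # (recall, precision)
    ap, prev_recall = 0.0, 0.0
    for i, (r, _) in enumerate(points):
        # Envelope: best precision achievable at recall >= r.
        p_env = max(p for _, p in points[i:])
        ap += (r - prev_recall) * p_env
        prev_recall = r
    return ap

# Toy per-class detection outcomes: (ranked hit/miss list, ground-truth count).
classes = {
    "car": ([True, True, False], 2),
    "pedestrian": ([True, False, True], 2),
}
ap_per_class = {c: average_precision(h, n) for c, (h, n) in classes.items()}
map_score = sum(ap_per_class.values()) / len(ap_per_class)
```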
6. ONWARD AND AUTONOMOUSLY: EXPANDING THE HORIZON OF IMAGE SEGMENTATION FOR SELF-DRIVING CARS THROUGH MACHINE LEARNING.
- Author
- RAVITEJA, TIRUMALAPUDI, M., NANDA KUMAR, and J., SIRISHA
- Subjects
- CONVOLUTIONAL neural networks, MACHINE learning, AUTONOMOUS vehicles, DEEP learning, DRIVERLESS cars, IMAGE processing, IMAGE segmentation, OBJECT recognition (Computer vision)
- Abstract
Autonomous navigation is a leading technology of the current era, in which intelligent traffic lights, sign detection, ADAS, and obstacle detection play major roles. Image segmentation is the process of dividing an image into different regions, or semantic classes. This is a challenging problem in autonomous vehicle technology because it requires the vehicle to understand its surroundings in order to navigate safely. The major challenges on this platform are the accuracy and efficiency of model performance. The proposed method uses a convolutional neural network (CNN) to perform image segmentation. CNNs are a type of deep learning model well suited to image processing tasks. The CNN in this paper was trained on a local city dataset and achieved a mean intersection over union (IoU) of 73%. IoU measures how well the segmentation results match the ground-truth labels: a score of 100% indicates a perfect segmentation, while 0% indicates a completely wrong one. The method can also segment images at a very fast rate, which is important for autonomous vehicles that need to make real-time decisions. Overall, the proposed method is a promising approach for image segmentation in autonomous vehicles: it achieves high accuracy and speed, and it is easy to implement using Python. The proposed method attains an accuracy of 98.34%, a sensitivity of 97.26%, and a specificity of 96.37%. The method could be used to improve the safety and efficiency of autonomous vehicles by enabling them to better understand their surroundings. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
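The mean IoU metric this abstract reports (73%) is computed per class and then averaged. A toy sketch with made-up 4x4 label maps:

```python
# Mean IoU: per class, intersection over union of predicted and
# ground-truth pixel sets, averaged over the classes present.

def mean_iou(pred, truth, num_classes):
    ious = []
    flat = list(zip(sum(pred, []), sum(truth, [])))  # (pred, truth) pairs
    for c in range(num_classes):
        inter = sum(1 for p, t in flat if p == c and t == c)
        union = sum(1 for p, t in flat if p == c or t == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious)

# Invented 4x4 label maps: 0 = road, 1 = vehicle, 2 = sidewalk.
truth = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [2, 2, 1, 1],
         [2, 2, 1, 1]]
pred  = [[0, 0, 1, 1],
         [0, 1, 1, 1],
         [2, 2, 1, 1],
         [2, 2, 2, 1]]
miou = mean_iou(pred, truth, num_classes=3)
```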
7. A Comprehensive Survey on Deep Learning Multi-Modal Fusion: Methods, Technologies and Applications.
- Author
- Jiao, Tianzhe, Guo, Chaopeng, Feng, Xiaoyue, Chen, Yuming, and Song, Jie
- Subjects
- MULTISENSOR data fusion, HUMAN-computer interaction, SENTIMENT analysis, JUDGMENT (Psychology), AUTONOMOUS vehicles, DEEP learning
- Abstract
Multi-modal fusion technology has gradually become a fundamental technique in many fields, such as autonomous driving, smart healthcare, sentiment analysis, and human-computer interaction. It is rapidly becoming a dominant research direction owing to its powerful perception and judgment capabilities. In complex scenes, multi-modal fusion technology exploits the complementary characteristics of multiple data streams to fuse different data types and achieve more accurate predictions. However, achieving outstanding performance is challenging because of equipment performance limitations, missing information, and data noise. This paper comprehensively reviews existing methods based on multi-modal fusion techniques and provides a detailed and in-depth analysis. According to the data fusion stage, multi-modal fusion has four primary methods: early fusion, deep fusion, late fusion, and hybrid fusion. The paper surveys three major multi-modal fusion technologies that can significantly enhance the effect of data fusion and further explores the applications of multi-modal fusion technology in various fields. Finally, it discusses the challenges and explores potential research opportunities. Multi-modal tasks still need intensive study because of data heterogeneity and quality: preserving complementary information and eliminating redundant information between modalities is critical, and invalid data fusion methods may introduce extra noise and lead to worse results. This paper provides a comprehensive and detailed summary in response to these challenges. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
8. Global Navigation Satellite System/Inertial Measurement Unit/Camera/HD Map Integrated Localization for Autonomous Vehicles in Challenging Urban Tunnel Scenarios.
- Author
- Tao, Lu, Zhang, Pan, Gao, Kefu, and Liu, Jingnan
- Subjects
- GLOBAL Positioning System, RAILROAD tunnels, TUNNELS, AUTONOMOUS vehicles, UNITS of measurement, CAMERAS
- Abstract
Lane-level localization is critical for autonomous vehicles (AVs). However, complex urban scenarios, particularly tunnels, pose significant challenges to AVs' localization systems. In this paper, we propose a fusion localization method that integrates multiple mass-production sensors, including Global Navigation Satellite Systems (GNSSs), Inertial Measurement Units (IMUs), cameras, and high-definition (HD) maps. Firstly, we use a novel electronic horizon module to assess GNSS integrity and concurrently load the HD map data surrounding the AVs. These map data are then transformed into a visual space to match the corresponding lane lines captured by the on-board camera using an improved BiSeNet. The matched HD map data are then used to correct our localization algorithm, which is driven by an extended Kalman filter that integrates multiple sources of information, encompassing GNSS, IMU, speedometer, camera, and HD maps. Our system is designed with redundancy to handle challenging city tunnel scenarios. To evaluate the proposed system, real-world experiments were conducted on a 36-kilometer city route that includes nine consecutive tunnels, totaling nearly 13 km and accounting for 35% of the entire route. The experimental results reveal that 99% of lateral localization errors are less than 0.29 m and 90% of longitudinal localization errors are less than 3.25 m, ensuring reliable lane-level localization for AVs in challenging urban tunnel scenarios. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
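The extended Kalman filter fusion described above can be reduced, for illustration, to a scalar predict/update cycle. The noise parameters and measurements below are invented; the paper's filter is multi-dimensional and fuses GNSS, IMU, speedometer, camera, and HD-map inputs:

```python
# One predict/update cycle of a scalar Kalman filter: predict the lateral
# position from a motion model, then correct it with a measurement
# (e.g. a lane line matched against the HD map).

def kalman_step(x, p, u, z, q, r):
    """x: state estimate, p: variance, u: control (predicted motion),
    z: measurement, q: process noise, r: measurement noise."""
    # Predict.
    x_pred = x + u
    p_pred = p + q
    # Update.
    k = p_pred / (p_pred + r)          # Kalman gain
    x_new = x_pred + k * (z - x_pred)
    p_new = (1 - k) * p_pred
    return x_new, p_new

x, p = 0.0, 1.0                        # start uncertain about the offset
for z in [0.4, 0.35, 0.42, 0.38]:      # noisy lateral offsets (metres)
    x, p = kalman_step(x, p, u=0.0, z=z, q=0.01, r=0.25)
```

With each measurement the estimate converges toward the true offset while the variance p shrinks.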
9. Vehicle Detection in Adverse Weather: A Multi-Head Attention Approach with Multimodal Fusion.
- Author
- Tabassum, Nujhat and El-Sharkawy, Mohamed
- Subjects
- TRANSFORMER models, OBJECT recognition (Computer vision), WEATHER, AUTONOMOUS vehicles
- Abstract
In the realm of autonomous vehicle technology, the multimodal vehicle detection network (MVDNet) represents a significant leap forward, particularly in the challenging context of adverse weather conditions. This paper focuses on the enhancement of MVDNet through the integration of a multi-head attention layer, aimed at refining its performance. The integrated multi-head attention layer in the MVDNet model is a pivotal modification, advancing the network's ability to process and fuse multimodal sensor information more efficiently. The paper validates the improved performance of MVDNet with multi-head attention through comprehensive testing, which includes a training dataset derived from the Oxford Radar RobotCar. The results clearly demonstrate that the multi-head MVDNet outperforms related conventional models, particularly in terms of average precision (AP), under challenging environmental conditions. The proposed multi-head MVDNet not only contributes significantly to the field of autonomous vehicle detection but also underscores the potential of sophisticated sensor fusion techniques in overcoming environmental limitations. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
10. Research on a Personalized Decision Control Algorithm for Autonomous Vehicles Based on the Reinforcement Learning from Human Feedback Strategy.
- Author
- Li, Ning and Chen, Pengzhan
- Subjects
- LEARNING, REINFORCEMENT learning, AUTONOMOUS vehicles, ALGORITHMS, DEEP learning
- Abstract
To address the shortcomings of previous autonomous decision models, which often overlook the personalized features of users, this paper proposes a personalized decision control algorithm for autonomous vehicles based on RLHF (reinforcement learning from human feedback). The algorithm combines two reinforcement learning approaches, DDPG (Deep Deterministic Policy Gradient) and PPO (Proximal Policy Optimization), and divides the control scheme into three phases: pre-training, human evaluation, and parameter optimization. During the pre-training phase, an agent is trained using the DDPG algorithm. In the human evaluation phase, different trajectories generated by the DDPG-trained agent are scored by individuals with different driving styles, and the respective reward models are trained on these trajectories. In the parameter optimization phase, the network parameters are updated using the PPO algorithm and the reward values given by the reward model, to achieve personalized autonomous vehicle control. To validate the control algorithm designed in this paper, a simulation scenario was built using CARLA_0.9.13 software. The results demonstrate that the proposed algorithm can provide personalized decision control solutions for people with different driving styles, satisfying human needs while ensuring safety. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
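The parameter-optimization phase above relies on PPO's clipped surrogate objective. A minimal sketch with toy probability ratios and advantages (the actual algorithm optimizes network parameters via gradients of this objective, with advantages derived from the learned reward model):

```python
# PPO clipped surrogate: mean over samples of min(r*A, clip(r, 1-eps, 1+eps)*A),
# which caps how far a single update can push the policy.

def ppo_clip_objective(ratios, advantages, eps=0.2):
    total = 0.0
    for r, a in zip(ratios, advantages):
        clipped = max(1 - eps, min(1 + eps, r))
        total += min(r * a, clipped * a)
    return total / len(ratios)

ratios = [0.9, 1.1, 1.5, 0.6]       # pi_new / pi_old per sampled action (toy)
advantages = [1.0, -0.5, 2.0, 1.0]  # toy advantages from a reward model
obj = ppo_clip_objective(ratios, advantages)
```

Note how the third sample's ratio of 1.5 is clipped to 1.2, limiting its contribution.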
11. A Multi-Sensor 3D Detection Method for Small Objects.
- Author
- Zhao, Yuekun, Luo, Suyun, Huang, Xiaoci, and Wei, Dan
- Subjects
- OBJECT recognition (Computer vision), IMAGE fusion, POINT cloud, LIDAR
- Abstract
In response to the limited accuracy of current three-dimensional (3D) object detection algorithms for small objects, this paper presents a multi-sensor 3D small object detection method based on LiDAR and a camera. Firstly, the LiDAR point cloud is projected onto the image plane to obtain a depth image. Subsequently, we propose a cascaded image fusion module comprising multi-level pooling layers and multi-level convolution layers. This module extracts features from both the camera image and the depth image, addressing the issue of insufficient depth information in the image feature. Considering the non-uniform distribution characteristics of the LiDAR point cloud, we introduce a multi-scale voxel fusion module composed of three sets of VFE (voxel feature encoder) layers. This module partitions the point cloud into grids of different sizes to improve detection ability for small objects. Finally, the multi-level fused point features are associated with the corresponding scale's initial voxel features to obtain the fused multi-scale voxel features, and the final detection results are obtained based on this feature. To evaluate the effectiveness of this method, experiments are conducted on the KITTI dataset, achieving a 3D AP (average precision) of 73.81% for the hard level of cars and 48.03% for the hard level of persons. The experimental results demonstrate that this method can effectively achieve 3D detection of small objects. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
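The multi-scale voxel partitioning idea in this abstract can be sketched by binning the same point cloud at two voxel sizes; the points and sizes are illustrative assumptions, not the paper's grid configuration:

```python
# The same points are binned into coarse and fine voxel grids, so small
# objects retain detail at the fine scale while the coarse scale gives context.

def voxelize(points, voxel_size):
    """Group 3-D points by voxel index; returns {index: [points]}."""
    voxels = {}
    for x, y, z in points:
        idx = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        voxels.setdefault(idx, []).append((x, y, z))
    return voxels

points = [(0.1, 0.2, 0.0), (0.3, 0.1, 0.1), (1.6, 1.7, 0.2), (1.9, 1.8, 0.1)]
coarse = voxelize(points, voxel_size=2.0)   # everything lands in one voxel
fine = voxelize(points, voxel_size=0.5)     # small structures separate
```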
12. Learning-Based Hierarchical Decision-Making Framework for Automatic Driving in Incompletely Connected Traffic Scenarios.
- Author
- Yang, Fan, Li, Xueyuan, Liu, Qi, Li, Xiangyu, and Li, Zirui
- Subjects
- DECISION making, EVIDENCE gaps, AUTONOMOUS vehicles, INFORMATION networks, TIME series analysis
- Abstract
The decision-making algorithm serves as a fundamental component for advancing the level of autonomous driving. End-to-end decision-making algorithms have a strong ability to process raw data but suffer from considerable uncertainty, while other learning-based decision-making algorithms rely heavily on ideal state information and are entirely unsuitable for autonomous driving tasks in real-world scenarios with incomplete global information. Addressing this research gap, this paper proposes a stable hierarchical decision-making framework with images as the input. The first step of the framework is a model-based data encoder that converts the input image data into a fixed universal data format. Next is a state machine based on a time-series Graph Convolutional Network (GCN), which is used to classify the current driving state. Finally, according to the state's classification, the corresponding rule-based algorithm is selected for action generation. Through verification, the algorithm demonstrates the ability to perform autonomous driving tasks in different traffic scenarios without relying on global network information. Comparative experiments further confirm the effectiveness of the hierarchical framework, the model-based image data encoder, and the time-series GCN. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
13. MotionTrack: rethinking the motion cue for multiple object tracking in USV videos.
- Author
- Liang, Zhenqi, Xiao, Gang, Hu, Jianqiu, Wang, Jingshi, and Ding, Chunshan
- Subjects
- KALMAN filtering, VIDEO compression, VIDEOS, SOURCE code, AUTONOMOUS vehicles, RUNNING speed, MOTION
- Abstract
Multiple object tracking (MOT) in unmanned surface vehicle (USV) videos has many application scenarios in the military and civilian fields. State-of-the-art MOT methods first extract a set of detections from the video frames, then utilize IoU distance to associate the detections of the current frame with the tracklets of the previous frame, and finally adopt a linear Kalman filter to estimate the current position of the tracklets. However, several characteristics of USV videos seriously affect tracking performance, such as low frame rate, wobble of the observation platform, nonlinear object motion, small objects, and ambiguous appearance. In this paper, we fully explore the motion cue in USV videos and propose a simple but effective tracker, named MotionTrack. Equipped with YOLOv7 as the object detector, the data association of MotionTrack is mainly composed of a cascade matching with Gaussian distance module and an observation-centric Kalman filter module. We validate its effectiveness with extensive experiments on the recent Jari-Maritime-Tracking-2022 dataset, achieving new state-of-the-art results of 46.9 MOTA and 49.2 IDF1 at a running speed of 35.2 FPS on a single 3090 GPU. The source code and pretrained models, with deploy versions, are released at https://github.com/lzq11/MotionTrack. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
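The association step this abstract describes (a Gaussian distance in place of IoU) can be sketched as a greedy center-distance matcher. The affinity form, the sigma value, and all coordinates are assumptions for illustration; the released MotionTrack code is the authoritative reference:

```python
# A Gaussian (center-distance) affinity stays informative when boxes of
# small objects no longer overlap between low-frame-rate frames,
# whereas IoU would be zero.

import math

def gaussian_affinity(c1, c2, sigma=50.0):
    """Affinity in (0, 1] from squared center distance."""
    d2 = (c1[0] - c2[0]) ** 2 + (c1[1] - c2[1]) ** 2
    return math.exp(-d2 / (2 * sigma ** 2))

def greedy_associate(tracks, detections, min_affinity=0.3):
    """Greedily match predicted track centers to detection centers."""
    pairs, used = [], set()
    for ti, t in enumerate(tracks):
        best, best_a = None, min_affinity
        for di, d in enumerate(detections):
            a = gaussian_affinity(t, d)
            if di not in used and a > best_a:
                best, best_a = di, a
        if best is not None:
            pairs.append((ti, best))
            used.add(best)
    return pairs

tracks = [(100, 100), (400, 120)]          # predicted tracklet centers
detections = [(395, 118), (110, 95), (700, 700)]   # (700, 700) is spurious
matches = greedy_associate(tracks, detections)
```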
14. A Vehicle-Edge-Cloud Framework for Computational Analysis of a Fine-Tuned Deep Learning Model.
- Author
- Khan, M. Jalal, Khan, Manzoor Ahmed, Turaev, Sherzod, Malik, Sumbal, El-Sayed, Hesham, and Ullah, Farman
- Subjects
- OBJECT recognition (Computer vision), DEEP learning, GEOGRAPHICAL perception, RASPBERRY Pi, AUTOMATIC timers, MACHINE learning
- Abstract
The cooperative, connected, and automated mobility (CCAM) infrastructure plays a key role in understanding and enhancing the environmental perception of autonomous vehicles (AVs) driving in complex urban settings. However, the deployment of CCAM infrastructure necessitates the efficient selection of the computational processing layer and deployment of machine learning (ML) and deep learning (DL) models to achieve greater performance of AVs in complex urban environments. In this paper, we propose a computational framework and analyze the effectiveness of a custom-trained DL model (YOLOv8) when deployed on diverse devices and settings in a vehicle-edge-cloud layered architecture. Our main focus is to understand the interplay between the DL model's accuracy and execution time during deployment in the layered framework; we therefore investigate the trade-offs between accuracy and time across the deployment of the YOLOv8 model over each layer of the computational framework, considering the CCAM infrastructure (sensory devices, computation, and communication) at each layer. The findings reveal that the performance metrics (e.g., 0.842 mAP@0.5) of the deployed DL models remain consistent regardless of device type across all layers of the framework. However, inference times for object detection tasks vary markedly across devices and deployment environments. For instance, the Jetson AGX (non-GPU) outperforms the Raspberry Pi (non-GPU) by reducing inference time by 72%, whereas the Jetson AGX Xavier (GPU) outperforms the Jetson AGX ARMv8 (non-GPU) by reducing inference time by 90%. A complete average-time comparison for the transfer time, preprocess time, and total time of devices such as the Apple M2 Max, Intel Xeon, Tesla T4, NVIDIA A100, and Tesla V100 is provided in the paper. Our findings direct researchers and practitioners to select the most appropriate device type and environment for the deployment of DL models required for production. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
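The per-stage timing comparisons summarized above follow a simple pattern: wall-clock each stage and average over runs. A stand-in sketch with dummy workloads, not YOLOv8 or the devices listed in the paper:

```python
# Average wall-clock time per stage over repeated runs, the basic shape
# of a transfer/preprocess/inference time comparison.

import time

def time_stage(fn, runs=50):
    """Average wall-clock seconds of fn over `runs` calls."""
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs

def preprocess():          # stand-in for image resize/normalisation
    [i * 0.5 for i in range(1000)]

def infer():               # stand-in for a model forward pass
    sum(i * i for i in range(5000))

timings = {"preprocess": time_stage(preprocess), "inference": time_stage(infer)}
total = sum(timings.values())
```

Using `time.perf_counter` (monotonic, high resolution) rather than `time.time` avoids clock adjustments skewing the measurement.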
15. Irregular boundaries stereo images dataset creating using depth estimation model.
- Author
- Wahsh, Muntasser A. and Hussain, Zainab M.
- Subjects
- DEEP learning, STEREO image, COMPUTER vision, COMPUTER simulation, APPLICATION software, AUTONOMOUS vehicles
- Abstract
This paper introduces a stereoscopic image and depth dataset created using a deep learning model. It addresses the challenge of obtaining accurate, annotated stereo image pairs with irregular boundaries for deep learning model training. The dataset provides a unique resource for training deep learning models to handle irregular-boundary stereoscopic images, which are valuable for real-world scenarios with complex shapes or occlusions. The dataset is created using a state-of-the-art monocular depth estimation model, and it can be used in applications such as image rectification, depth estimation, object detection, and autonomous driving. Overall, this paper presents a novel dataset and demonstrates its effectiveness and potential for advancing stereo vision and the development of deep learning models for computer vision applications. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
16. Survey of Deep Learning-Based Methods for FMCW Radar Odometry and Ego-Localization.
- Author
- Brune, Marvin, Meisen, Tobias, and Pomp, André
- Subjects
- ROAD vehicle radar, DEEP learning, AUTONOMOUS vehicles, MOTION capture (Human mechanics), DETECTORS
- Abstract
This paper provides an in-depth review of deep learning techniques that address the challenges of odometry and global ego-localization using frequency modulated continuous wave (FMCW) radar sensors. In particular, we focus on the prediction of odometry, which involves the determination of the ego-motion of a system by external sensors, and loop closure detection, which concentrates on the determination of the ego-position, typically on an existing map. We initially emphasize the significance of these tasks in the context of radar sensors and underscore the motivations behind them. The subsequent sections delve into the practical implementation of deep learning approaches strategically designed to address these challenges effectively. We primarily focus on spinning and automotive radar configurations within the domain of autonomous driving. Additionally, we introduce publicly available datasets that have been instrumental in addressing these challenges and analyze the importance of, and the difficulties faced by, current methods for radar-based odometry and localization. In conclusion, this paper highlights the distinctions between the addressed tasks and other radar perception applications, while also discussing their differences from challenges posed by alternative sensor modalities. The findings contribute to the ongoing discourse on advancing radar sensor capabilities through the application of deep learning methodologies, particularly in the context of enhancing odometry and ego-localization for autonomous driving applications. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
17. Object Detection in Autonomous Vehicles under Adverse Weather: A Review of Traditional and Deep Learning Approaches.
- Author
- Tahir, Noor Ul Ain, Zhang, Zuping, Asim, Muhammad, Chen, Junhong, and ELAffendi, Mohammed
- Subjects
- OBJECT recognition (Computer vision), DEEP learning, INTELLIGENT transportation systems, COMPUTER vision, PEDESTRIANS, LITERATURE reviews, GEOGRAPHICAL perception, WEATHER, AUTONOMOUS vehicles
- Abstract
Enhancing the environmental perception of autonomous vehicles (AVs) in intelligent transportation systems requires computer vision technology to be effective in detecting objects and obstacles, particularly in adverse weather conditions. Adverse weather circumstances present serious difficulties for object detection systems, which are essential to contemporary safety procedures, monitoring infrastructure, and intelligent transportation. AVs primarily depend on image processing algorithms that utilize a wide range of onboard visual sensors for guidance and decision-making. Ensuring the consistent identification of critical elements such as vehicles, pedestrians, and road lanes, even in adverse weather, is a paramount objective. This paper not only provides a comprehensive review of the literature on object detection (OD) under adverse weather conditions but also delves into the ever-evolving architecture of AVs, the challenges automated vehicles face in adverse weather, and the basic structure of OD, and explores the landscape of traditional and deep learning (DL) approaches for OD within the realm of AVs. These approaches are essential for advancing the capabilities of AVs in recognizing and responding to objects in their surroundings. This paper further investigates previous research that has employed both traditional and DL methodologies for the detection of vehicles, pedestrians, and road lanes, effectively linking these approaches with the evolving field of AVs. Moreover, this paper offers an in-depth analysis of the datasets commonly employed in AV research, with a specific focus on the detection of key elements in various environmental conditions, and then summarizes the evaluation metrics. We expect that this review paper will help scholars gain a better understanding of this area of research. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
18. Leveraging Perspective Transformation for Enhanced Pothole Detection in Autonomous Vehicles.
- Author
- Abu-raddaha, Abdalmalek, El-Shair, Zaid A., and Rawashdeh, Samir
- Abstract
Road conditions, often degraded by insufficient maintenance or adverse weather, significantly contribute to accidents, exacerbated by the limited human reaction time to sudden hazards like potholes. Early detection of distant potholes is crucial for timely corrective actions, such as reducing speed or avoiding obstacles, to mitigate vehicle damage and accidents. This paper introduces a novel approach that utilizes perspective transformation to enhance pothole detection at different distances, focusing particularly on distant potholes. Perspective transformation improves the visibility and clarity of potholes by virtually bringing them closer and enlarging their features, which is particularly beneficial given the fixed-size input requirement of object detection networks, typically significantly smaller than the raw image resolutions captured by cameras. Our method automatically identifies the region of interest (ROI)—the road area—and calculates the corner points to generate a perspective transformation matrix. This matrix is applied to all images and corresponding bounding box labels, enhancing the representation of potholes in the dataset. This approach significantly boosts detection performance when used with YOLOv5-small, achieving a 43% improvement in the average precision (AP) metric at intersection-over-union thresholds of 0.5 to 0.95 for single class evaluation, and notable improvements of 34%, 63%, and 194% for near, medium, and far potholes, respectively, after categorizing them based on their distance. To the best of our knowledge, this work is the first to employ perspective transformation specifically for enhancing the detection of distant potholes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
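The label-side half of the approach above (applying the perspective transformation matrix to bounding-box labels) can be sketched directly. The homography H and the pothole box are made-up values for illustration, not a matrix computed from a real road ROI:

```python
# Warp each bounding-box corner through a 3x3 homography, then re-fit an
# axis-aligned box around the warped corners.

def warp_point(h, x, y):
    """Apply homography h (3x3 nested list) to point (x, y)."""
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)

def warp_box(h, box):
    """Warp bbox (x1, y1, x2, y2) and re-fit an axis-aligned box."""
    x1, y1, x2, y2 = box
    corners = [warp_point(h, x, y)
               for x, y in [(x1, y1), (x2, y1), (x1, y2), (x2, y2)]]
    xs = [p[0] for p in corners]
    ys = [p[1] for p in corners]
    return (min(xs), min(ys), max(xs), max(ys))

H = [[1.2, 0.0, 10.0],       # made-up: mild scale + shift,
     [0.0, 1.5, -5.0],       # slight perspective term below
     [0.0, 0.001, 1.0]]
pothole_box = (100, 200, 140, 230)
warped = warp_box(H, pothole_box)
```

In a full pipeline the same H would also warp the image itself (e.g. with a routine like OpenCV's `warpPerspective`), keeping pixels and labels aligned.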
19. A Method for All-Weather Unstructured Road Drivable Area Detection Based on Improved Lite-Mobilenetv2.
- Author
- Wang, Qingyu, Lyu, Chenchen, and Li, Yanyan
- Subjects
- GEOGRAPHICAL perception, FEATURE extraction, AUTONOMOUS vehicles, WEATHER, TRANSFER of training, DEEP learning
- Abstract
This paper presents an all-weather drivable area detection method based on deep learning, addressing the challenges of recognizing unstructured roads and achieving clear environmental perception under adverse weather conditions in current autonomous driving systems. The method enhances the Lite-Mobilenetv2 feature extraction module and integrates a pyramid pooling module with an attention mechanism. Moreover, it introduces a defogging preprocessing module suitable for real-time detection, which transforms foggy images into clear ones for accurate drivable area detection. The experiments adopt a transfer learning-based training approach, training an all-road-condition semantic segmentation model on four datasets that include both structured and unstructured roads, with and without fog. This strategy reduces computational load and enhances detection accuracy. Experimental results demonstrate a 3.84% efficiency improvement compared to existing algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
20. AOHDL: Adversarial Optimized Hybrid Deep Learning Design for Preventing Attack in Radar Target Detection.
- Author
-
Akhtar, Muhammad Moin, Li, Yong, Cheng, Wei, Dong, Limeng, Tan, Yumei, and Geng, Langhuan
- Subjects
RADAR targets ,GENERATIVE adversarial networks ,AUTONOMOUS vehicles ,RADAR ,DETECTORS - Abstract
In autonomous driving, Frequency-Modulated Continuous-Wave (FMCW) radar has gained widespread acceptance for target detection due to its resilience and dependability under diverse weather and illumination circumstances. Although deep learning radar target identification models have seen fast improvement, there is a lack of research on their susceptibility to adversarial attacks. Various spoofing attack techniques have been suggested to target radar sensors by deliberately sending certain signals through specialized devices. In this paper, we propose a new adversarial deep learning network for spoofing attacks in radar target detection (RTD). Multi-level adversarial attack prevention using deep learning is designed for the coherence pulse deep feature map from DAALnet and the Range-Doppler (RD) map from TDDLnet. After discrimination of the attack, optimization of hybrid deep learning (OHDL) integrated with enhanced particle swarm optimization (PSO) is used to predict the range and velocity of the target. Simulations are performed to evaluate the sensitivity of AOHDL for different radar environment configurations. The RMSE of AOHDL is almost the same as that of OHDL under no-attack conditions, and it outperforms earlier RTD implementations. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
21. The Road to Safety: A Review of Uncertainty and Applications to Autonomous Driving Perception.
- Author
-
Araújo, Bernardo, Teixeira, João F., Fonseca, Joaquim, Cerqueira, Ricardo, and Beco, Sofia C.
- Subjects
DEEP learning ,AUTONOMOUS vehicles ,TRUST ,ROAD safety measures ,ARTIFICIAL intelligence - Abstract
Deep learning approaches have been gaining importance in several applications. However, the widespread use of these methods in safety-critical domains, such as Autonomous Driving, is still dependent on their reliability and trustworthiness. The goal of this paper is to provide a review of deep learning-based uncertainty methods and their applications to support perception tasks for Autonomous Driving. We detail significant Uncertainty Quantification and calibration methods, and their contributions and limitations, as well as important metrics and concepts. We present an overview of the state of the art of out-of-distribution detection and active learning, where uncertainty estimates are commonly applied. We show how these methods have been applied in the automotive context, providing a comprehensive analysis of reliable AI for Autonomous Driving. Finally, challenges and opportunities for future work are discussed for each topic. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
22. Real-Time Deep Learning Framework for Accurate Speed Estimation of Surrounding Vehicles in Autonomous Driving.
- Author
-
García-Aguilar, Iván, García-González, Jorge, Domínguez, Enrique, López-Rubio, Ezequiel, and Luque-Baena, Rafael M.
- Subjects
CONVOLUTIONAL neural networks ,TRACKING algorithms ,COMPUTER systems ,SYSTEM safety ,REGRESSION analysis ,AUTONOMOUS vehicles - Abstract
Accurate speed estimation of surrounding vehicles is of paramount importance for autonomous driving to prevent potential hazards. This paper emphasizes the critical role of precise speed estimation and presents a novel real-time framework based on deep learning to achieve this from images captured by an onboard camera. The system detects and tracks vehicles using convolutional neural networks and analyzes their trajectories with a tracking algorithm. Vehicle speeds are then accurately estimated using a regression model based on random sample consensus. A synthetic dataset using the CARLA simulator has been generated to validate the presented methodology. The system can simultaneously estimate the speed of multiple vehicles and can be easily integrated into onboard computer systems, providing a cost-effective solution for real-time speed estimation. This technology holds significant potential for enhancing vehicle safety systems, driver assistance, and autonomous driving. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
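The regression "based on random sample consensus" used for speed estimation in entry 22 can be illustrated with a minimal RANSAC line fit: the slope of distance versus time is the speed, and outlying track points (e.g. detection glitches) are excluded from the consensus set. The data, threshold, and pure-NumPy loop below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def ransac_speed(t, d, n_iters=200, thresh=0.5, seed=0):
    """Robustly estimate speed as the slope of distance vs. time.

    Repeatedly fits a line through two random samples, keeps the
    consensus set of points within `thresh` of the line, and refits
    with least squares on the largest consensus set found.
    """
    rng = np.random.default_rng(seed)
    t, d = np.asarray(t, float), np.asarray(d, float)
    best_inliers = np.zeros(len(t), bool)
    for _ in range(n_iters):
        i, j = rng.choice(len(t), size=2, replace=False)
        if t[i] == t[j]:
            continue
        slope = (d[j] - d[i]) / (t[j] - t[i])
        intercept = d[i] - slope * t[i]
        inliers = np.abs(d - (slope * t + intercept)) < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    slope, _ = np.polyfit(t[best_inliers], d[best_inliers], 1)
    return slope

# Relative positions sampled at 10 Hz: true speed 15 m/s plus two
# simulated tracking glitches that a plain least-squares fit would absorb.
t = np.arange(10) * 0.1
d = 15.0 * t
d[3] += 5.0
d[7] -= 4.0
```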
23. A survey on deep learning-based spatio-temporal action detection.
- Author
-
Wang, Peng, Zeng, Fanwei, and Qian, Yuntao
- Subjects
- *
COMPUTER vision , *DEEP learning , *AUTONOMOUS vehicles - Abstract
Spatio-temporal action detection (STAD) aims to classify the actions present in a video and localize them in space and time. It has become a particularly active area of research in computer vision because of its explosively emerging real-world applications, such as autonomous driving, visual surveillance and entertainment. Many efforts have been devoted in recent years to building a robust and effective framework for STAD. This paper provides a comprehensive review of the state-of-the-art deep learning-based methods for STAD. First, a taxonomy is developed to organize these methods. Next, the linking algorithms, which aim to associate the frame- or clip-level detection results together to form action tubes, are reviewed. Then, the commonly used benchmark datasets and evaluation metrics are introduced, and the performance of state-of-the-art models is compared. Finally, the paper concludes with a discussion of potential research directions for STAD. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
24. Improved traffic sign recognition system (ITSRS) for autonomous vehicle based on deep convolutional neural network.
- Author
-
Kheder, Mohammed Qader and Mohammed, Aree Ali
- Subjects
TRAFFIC signs & signals ,CONVOLUTIONAL neural networks ,DEEP learning ,DRIVER assistance systems ,ARTIFICIAL intelligence ,COMPUTER vision ,AUTONOMOUS vehicles - Abstract
Due to the considerable number of deaths and vehicle accidents caused by driver inattention, as reported by the WHO, automobile manufacturers aim to combine advanced driver assistance systems (ADAS) with artificial intelligence algorithms, particularly deep learning and computer vision techniques. One feature that assists drivers is traffic sign recognition, which allows vehicles to detect and recognize road signs, aided by computer vision and Convolutional Neural Networks (CNNs). The main aim of this research is to propose and improve a CNN-based model that can be applied efficiently and accurately in embedded applications, accomplished with the help of several preprocessing algorithms. An improved network model based on LeNet-5 has been developed for the classification of traffic signs. The proposed network is trained on both the German Traffic Sign Recognition Benchmark (GTSRB) and the extended GTSRB (EGTSRB) datasets. According to the test results, the improved LeNet-5 architecture achieved accuracies of 99.12% on GTSRB and 99.78% on EGTSRB, comparing favorably with other state-of-the-art results. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
25. A Review of Deep Learning-Based LiDAR and Camera Extrinsic Calibration.
- Author
-
Tan, Zhiguo, Zhang, Xing, Teng, Shuhua, Wang, Ling, and Gao, Feng
- Subjects
LIDAR ,MULTISENSOR data fusion ,MOBILE robots ,PARAMETER estimation ,CAMERA calibration ,AUTONOMOUS vehicles ,LASER based sensors ,CAMERAS - Abstract
Extrinsic parameter calibration is the foundation and prerequisite for LiDAR and camera data fusion of the autonomous system. This technology is widely used in fields such as autonomous driving, mobile robots, intelligent surveillance, and visual measurement. The learning-based method is one of the targetless calibrating methods in LiDAR and camera calibration. Due to its advantages of fast speed, high accuracy, and robustness under complex conditions, it has gradually been applied in practice from a simple theoretical model in just a few years, becoming an indispensable and important method. This paper systematically summarizes the research and development of this type of method in recent years. According to the principle of calibration parameter estimation, learning-based calibration algorithms are divided into two categories: accurate calibrating estimation and relative calibrating prediction. The evolution routes and algorithm frameworks of these two types of algorithms are elaborated, and the methods used in the algorithms' steps are summarized. The algorithm mechanism, advantages, limitations, and applicable scenarios are discussed. Finally, we make a summary, pointing out existing research issues and trends for future development. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
26. HP-LSTM: Hawkes Process–LSTM-Based Detection of DDoS Attack for In-Vehicle Network.
- Author
-
Li, Xingyu, Li, Ruifeng, and Liu, Yanchen
- Subjects
DENIAL of service attacks ,MIDDLEWARE ,IN-vehicle computing ,DEEP learning ,AUTONOMOUS vehicles ,AUTOMOBILE industry - Abstract
Connected and autonomous vehicles (CAVs) are advancing rapidly with the development of the automotive industry, which opens up new possibilities for attacks. A Distributed Denial-of-Service (DDoS) attacker floods the in-vehicle network with fake messages, resulting in the failure of driving assistance systems and impairment of vehicle control functionalities, seriously disrupting the normal operation of the vehicle. In this paper, we propose a novel DDoS attack detection method for in-vehicle Ethernet Scalable service-Oriented Middleware over IP (SOME/IP), which integrates the Hawkes process with Long Short-Term Memory networks (LSTMs) to capture the dynamic behavioral features of the attacker. Specifically, we employ the Hawkes process to capture features of the DDoS attack, with its parameters reflecting the dynamism and self-exciting properties of the attack events. Subsequently, we propose a novel deep learning network structure, an HP-LSTM block, inspired by the Hawkes process, while employing a residual attention block to enhance the model's detection efficiency and accuracy. Additionally, due to the scarcity of publicly available datasets for SOME/IP, we employed a mature SOME/IP generator to create a dataset for evaluating the validity of the proposed detection model. Finally, extensive experiments were conducted to demonstrate the effectiveness of the proposed DDoS attack detection method. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
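The Hawkes process used in entry 26 to capture self-exciting attack dynamics has the standard conditional intensity λ(t) = μ + Σ_{t_i < t} α·exp(−β(t − t_i)): each past event temporarily raises the rate of future events. A minimal sketch follows; the parameter values are arbitrary illustrations, not the paper's fitted model:

```python
import math

def hawkes_intensity(t, events, mu=0.2, alpha=0.8, beta=1.5):
    """Conditional intensity of a univariate Hawkes process: a baseline
    rate mu plus an exponentially decaying kick of size alpha for every
    past event, decaying at rate beta."""
    return mu + sum(alpha * math.exp(-beta * (t - ti))
                    for ti in events if ti < t)

# A burst of message timestamps (e.g. a suspected DDoS flood) drives the
# intensity well above the baseline; it then decays back toward mu.
events = [1.0, 1.1, 1.2, 1.3]
```

Before any event the intensity is just the baseline μ; right after the burst it is elevated, which is the "self-exciting" signature the HP-LSTM block is designed to exploit.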
27. Application of Hybrid Deep Reinforcement Learning for Managing Connected Cars at Pedestrian Crossings: Challenges and Research Directions.
- Author
-
Brunoud, Alexandre, Lombard, Alexandre, Gaud, Nicolas, and Abbas-Turki, Abdeljalil
- Subjects
PEDESTRIAN crosswalks ,REINFORCEMENT learning ,DEEP learning ,AUTONOMOUS vehicles ,MACHINE learning ,VISUALIZATION - Abstract
The autonomous vehicle is an innovative field for the application of machine learning algorithms. Controlling an agent designed to drive safely in traffic is very complex, as human behavior is difficult to predict. An individual's actions depend on a large number of factors that cannot be acquired directly by visual observation. The size of the vehicle, its vulnerability, its perception of the environment and weather conditions, among others, are all parameters that profoundly modify the actions that the optimized model should take. The agent must therefore have a great capacity for adaptation and anticipation in order to drive while ensuring the safety of users, especially pedestrians, who remain the most vulnerable users on the road. Deep reinforcement learning (DRL), a sub-field supported by the community for its real-time learning capability and the long-term temporal aspect of its objectives, looks promising for AV control. In a previous article, we were able to show the strong capabilities of a DRL model with a continuous action space to manage the speed of a vehicle when approaching a pedestrian crossing. One of the points that remains to be addressed is the notion of discrete decision-making intrinsically linked to speed control. In this paper, we present the problems of AV control during a pedestrian crossing, starting with a modeling of the scenario and a DRL model with a hybrid action space adapted to the scalability of a vehicle-to-pedestrian (V2P) encounter. We also present the difficulties raised by scalability and the curriculum-based method. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
28. Accurate Object Detection on Feature Maps with Multi-Shape Receptive Field.
- Author
-
Pengfei Li, Wei Wei, Yu Yan, Rong Zhu, LokHin Fung, and Muchen Li
- Subjects
OBJECT recognition (Computer vision) ,TRAFFIC signs & signals ,DEEP learning ,INDUSTRIAL robots ,AUTONOMOUS vehicles - Abstract
Object detection has been used in a wide range of industries. For example, in autonomous driving, the task of object detection is to accurately and efficiently identify and locate a large number of predefined classes of object instances (vehicles, pedestrians, traffic signs, etc.) from road videos. In robotics, the industrial robot needs to recognize specific machine elements. In the security field, the camera should accurately recognize people's faces. With the wide application of deep learning, the accuracy and efficiency of object detection have greatly improved, but object detection based on deep learning still faces challenges. Different applications of object detection have different requirements, including highly accurate detection, multi-category object detection, real-time detection, robustness to occlusions, etc. To address the above challenges, based on extensive literature research, this paper analyzes methods for improving and optimizing mainstream object detection algorithms from the perspective of evolution of one-stage and two-stage object detection algorithms. Furthermore, this article proposes methods for improving object detection accuracy from the perspective of changing receptive fields. The new model is based on the original YOLOv5 (You Only Look Once) with some modifications. The structure of the head part of YOLOv5 is modified by adding asymmetrical pooling layers. As a result, the accuracy of the algorithm is improved while ensuring speed. The performance of the new model in this article is compared with that of the original YOLOv5 model and analyzed by several parameters. In addition, the new model is evaluated under four scenarios. Moreover, a summary and outlook on the problems to be solved and the research directions in the future are presented. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
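The asymmetrical pooling added to the YOLOv5 head in entry 28 can be illustrated with a plain-NumPy max-pool over a non-square window: the receptive field stretches along one axis while the other is left unchanged. The 1 × 3 window and stride-equals-kernel choice are assumptions for illustration, not the paper's exact layer:

```python
import numpy as np

def asym_max_pool(x, kh, kw):
    """Max-pool a 2D feature map with a kh x kw window and matching
    stride. A 1 x 3 window widens the horizontal receptive field while
    leaving the vertical one unchanged."""
    h, w = x.shape
    assert h % kh == 0 and w % kw == 0, "map must tile evenly"
    return x.reshape(h // kh, kh, w // kw, kw).max(axis=(1, 3))

feat = np.arange(12, dtype=float).reshape(2, 6)
# [[ 0  1  2  3  4  5]
#  [ 6  7  8  9 10 11]]
pooled = asym_max_pool(feat, 1, 3)  # shape (2, 2): max of each 1x3 strip
```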
29. Rainy Environment Identification Based on Channel State Information for Autonomous Vehicles.
- Author
-
Feng, Jianxin, Li, Xinhui, and Fang, Hui
- Subjects
CONVOLUTIONAL neural networks ,AUTONOMOUS vehicles ,DEEP learning ,WEATHER ,RAINFALL ,WIRELESS communications - Abstract
In this paper, we introduce a deep learning approach specifically designed for environment identification by intelligent vehicles under rainy conditions. In constructing the wireless vehicular communication network, additional multipath components are incorporated to simulate the impact of raindrop scattering on the vehicle-to-vehicle (V2V) channel, emulating the channel characteristics of vehicular environments under rainy conditions, and an equalization strategy is applied at the receiver end of the OFDM-based system to counteract channel distortion. A rainy environment identification method for autonomous vehicles is then proposed. The core of this method lies in utilizing the Channel State Information (CSI) shared within the vehicular network to accurately identify the diverse rainy environments in which the vehicle operates, without relying on traditional sensors. The environmental identification task is treated as a multi-class classification problem, and a dedicated Convolutional Neural Network (CNN) model is proposed. This CNN model uses the CSI estimated from Cooperative Awareness Messages (CAMs) exchanged in V2V communication as training features. Simulation results showed that our method achieved an accuracy rate of 95.7% in recognizing various rainy environments, significantly surpassing existing classical classification models. Moreover, it took only microseconds to predict with high accuracy, surpassing the performance limitations of traditional sensing systems under adverse weather conditions. This ensures that intelligent vehicles can rapidly and accurately adjust driving parameters, even in complex weather conditions like rain, to drive autonomously, safely, and reliably. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
30. Object detection in adverse weather condition for autonomous vehicles.
- Author
-
Appiah, Emmanuel Owusu and Mensah, Solomon
- Abstract
As self-driving or autonomous vehicles proliferate in our society, there is a need for their computer vision systems to be able to identify objects accurately, no matter the weather condition. One major concern in computer vision is improving an autonomous car's capacity to discern the components of its environment under challenging conditions. For instance, inclement weather like fog and rain can corrupt images, which affects how well autonomous vehicles navigate and localise themselves. The aim is to provide an efficient and effective approach for autonomous vehicles to accurately detect objects during adverse weather conditions. The study employed the combination of two deep learning approaches, namely YOLOv7 and ESRGAN. ESRGAN first learns from a set of training data and adjusts for the unfavourable weather conditions in the images before the YOLOv7 detector performs object detection, allowing the adaptive enhancement of each image for improved detection performance. In both good and bad weather, the employed hybrid approach (YOLOv7 + ESRGAN) works well, achieving about 80% accuracy in detecting objects during adverse weather conditions. We recommend further study on the methodology utilised in this paper to tackle the trolley-dilemma problem during inclement weather. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
31. Computer Vision-Based Position Estimation for an Autonomous Underwater Vehicle.
- Author
-
Zalewski, Jacek and Hożyń, Stanisław
- Subjects
AUTONOMOUS underwater vehicles ,MARINE engineering ,ELECTRONIC equipment ,AUTOMOTIVE navigation systems ,SUBMERSIBLES ,COMPUTER vision ,DEEP learning ,GLOBAL Positioning System ,AUTONOMOUS vehicles - Abstract
Autonomous Underwater Vehicles (AUVs) are currently one of the most intensively developing branches of marine technology. Their widespread use and versatility allow them to perform tasks that, until recently, required human resources. One problem in AUVs is inadequate navigation, which results in inaccurate positioning. Weaknesses in electronic equipment lead to errors in determining a vehicle's position during underwater missions, requiring periodic reduction of accumulated errors through the use of radio navigation systems (e.g., GNSS). However, these signals may be unavailable or deliberately distorted. Therefore, in this paper, we propose a new computer vision-based method for estimating the position of an AUV. Our method uses computer vision and deep learning techniques to generate the surroundings of the vehicle during temporary surfacing at the point where it is currently located. The next step is to compare this with the shoreline representation on the map, which is generated for a set of points that are in a specific vicinity of a point determined by dead reckoning. This method is primarily intended for low-cost vehicles without advanced navigation systems. Our results suggest that the proposed solution reduces the error in vehicle positioning to 30–60 m and can be used in incomplete shoreline representations. Further research will focus on the use of the proposed method in fully autonomous navigation systems. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
32. Collision Risk in Autonomous Vehicles: Classification, Challenges, and Open Research Areas.
- Author
-
Goudarzi, Pejman and Hassanzadeh, Bardia
- Subjects
ADAPTIVE control systems ,OPEN scholarship ,AUTONOMOUS vehicles ,DRIVERLESS cars ,DEEP learning ,MACHINE learning - Abstract
When car following is controlled by human drivers (i.e., by their behavior), the traffic system does not meet stability conditions. In order to ensure the safety and reliability of self-driving vehicles, an additional hazard warning system should be incorporated into the adaptive control system to prevent otherwise unavoidable collisions. The time to contact is a reasonable indicator of potential collisions. This research examines systems and solutions developed in this field that determine collision times and use various alarms in self-driving cars to prevent collisions with obstacles. In the proposed analysis, we have tried to classify the various techniques and methods, including image processing, machine learning, deep learning, and sensors, based on the solutions we have investigated. Challenges, future research directions, and open problems in this important field are also highlighted in the paper. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
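The time to contact that entry 32 centers on is, in its simplest constant-speed form, the current gap divided by the closing speed. A minimal sketch follows; the 2-second warning threshold is an illustrative assumption, not a value from the review:

```python
def time_to_contact(gap_m, ego_speed_ms, lead_speed_ms):
    """Seconds until contact assuming both speeds stay constant.

    Returns float('inf') when the gap is opening (no collision course),
    since the closing speed is then non-positive.
    """
    closing = ego_speed_ms - lead_speed_ms
    return gap_m / closing if closing > 0 else float("inf")

def should_warn(gap_m, ego_speed_ms, lead_speed_ms, threshold_s=2.0):
    """Raise an alarm when the time to contact drops below a threshold."""
    return time_to_contact(gap_m, ego_speed_ms, lead_speed_ms) < threshold_s

# 30 m behind a lead car while closing at 20 m/s -> contact in 1.5 s.
```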
33. Deep Learning-Based Vehicle Type and Color Classification to Support Safe Autonomous Driving.
- Author
-
Kim, JongBae
- Subjects
TRAFFIC safety ,CONVOLUTIONAL neural networks ,AUTONOMOUS vehicles ,DEEP learning ,SYSTEM safety - Abstract
This technology can prevent accidents involving large vehicles, such as trucks or buses, by selecting an optimal driving lane for safe autonomous driving. This paper proposes a method for detecting forward-driving vehicles within road images obtained from a vehicle's DashCam. The proposed method also classifies the types and colors of the detected vehicles. The proposed method uses a YOLO deep learning network for vehicle detection based on a pre-trained ResNet-50 convolutional neural network. Additionally, a ResNet-50 CNN-based object classifier, using transfer learning, was used to classify vehicle types and colors. Vehicle types were classified into four categories based on size, whereas vehicle colors were classified into eight categories. During autonomous driving, vehicle types are used to determine driving lanes, whereas vehicle colors are used to distinguish the road infrastructure, such as lanes, vehicles, roads, backgrounds, and buildings. The datasets used for learning consisted of road images acquired in various driving environments. The proposed method achieved a vehicle detection accuracy of 91.5%, a vehicle type classification accuracy of 93.9%, and a vehicle color classification accuracy of 94.2%. It accurately detected vehicles and classified their types and colors. These capabilities can be applied to autonomous and safe driving support systems to enhance the safety of autonomous vehicles. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
34. TSE-UNet: Temporal and Spatial Feature-Enhanced Point Cloud Super-Resolution Model for Mechanical LiDAR.
- Author
-
Ren, Lu, Li, Deyi, Ouyang, Zhenchao, and Zhang, Zhibin
- Subjects
POINT cloud ,MECHANICAL models ,LIDAR ,DEEP learning ,GEOSTATIONARY satellites ,GEOGRAPHICAL perception ,AUTONOMOUS vehicles - Abstract
The mechanical LiDAR sensor is crucial in autonomous vehicles. After projecting a 3D point cloud onto a 2D plane and employing a deep learning model for computation, accurate environmental perception information can be supplied to autonomous vehicles. Nevertheless, the vertical angular resolution of inexpensive multi-beam LiDAR is limited, constraining the perceptual and mobility range of mobile entities. To address this problem, we propose a point cloud super-resolution model in this paper. This model enhances the density of sparse point clouds acquired by LiDAR, consequently offering more precise environmental information for autonomous vehicles. Firstly, we collect two datasets for point cloud super-resolution, encompassing CARLA32-128 in simulated environments and Ruby32-128 in real-world scenarios. Secondly, we propose a novel temporal and spatial feature-enhanced point cloud super-resolution model. This model leverages temporal feature attention aggregation modules and spatial feature enhancement modules to fully exploit point cloud features from adjacent timestamps, enhancing super-resolution accuracy. Ultimately, we validate the effectiveness of the proposed method through comparison experiments, ablation studies, and qualitative visualization experiments conducted on the CARLA32-128 and Ruby32-128 datasets. Notably, our method achieves a PSNR of 27.52 on CARLA32-128 and a PSNR of 24.82 on Ruby32-128, both of which are better than previous methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
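The PSNR figures quoted in entry 34 (27.52 on CARLA32-128, 24.82 on Ruby32-128) follow the standard definition PSNR = 10·log10(peak² / MSE). A minimal sketch, with toy arrays chosen so the result is easy to verify by hand:

```python
import numpy as np

def psnr(reference, reconstruction, peak=1.0):
    """Peak signal-to-noise ratio in dB between two same-shaped arrays."""
    mse = np.mean((np.asarray(reference, float)
                   - np.asarray(reconstruction, float)) ** 2)
    if mse == 0:
        return float("inf")  # identical signals
    return 10.0 * np.log10(peak ** 2 / mse)

# A uniform error of 0.1 on a unit-peak signal gives MSE = 0.01,
# i.e. PSNR = 10 * log10(1 / 0.01) = 20 dB.
ref = np.zeros((4, 4))
rec = ref + 0.1
```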
35. Adaptive Point-Line Fusion: A Targetless LiDAR–Camera Calibration Method with Scheme Selection for Autonomous Driving.
- Author
-
Zhou, Yingtong, Han, Tiansi, Nie, Qiong, Zhu, Yuxuan, Li, Minghu, Bian, Ning, and Li, Zhiheng
- Subjects
CALIBRATION ,LIDAR - Abstract
Accurate calibration between LiDAR and camera sensors is crucial for autonomous driving systems to perceive and understand the environment effectively. Typically, LiDAR–camera extrinsic calibration requires feature alignment and overlapping fields of view. Aligning features from different modalities can be challenging due to noise influence. Therefore, this paper proposes a targetless extrinsic calibration method for monocular cameras and LiDAR sensors that have a non-overlapping field of view. The proposed solution uses pose transformation to establish data association across different modalities. This conversion turns the calibration problem into an optimization problem within a visual SLAM system without requiring overlapping views. To improve performance, line features serve as constraints in visual SLAM. Accurate positions of line segments are obtained by utilizing an extended photometric error optimization method. Moreover, a strategy is proposed for selecting appropriate calibration methods from among several alternative optimization schemes. This adaptive calibration method selection strategy ensures robust calibration performance in urban autonomous driving scenarios with varying lighting and environmental textures while avoiding failures and excessive bias that may result from relying on a single approach. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
36. A Survey of 6DoF Object Pose Estimation Methods for Different Application Scenarios.
- Author
-
Guan, Jian, Hao, Yingming, Wu, Qingxiao, Li, Sicong, and Fang, Yingjian
- Subjects
SINGLE-degree-of-freedom systems ,DEEP learning ,VIRTUAL reality ,RESEARCH personnel ,AUTONOMOUS vehicles ,COMPUTER vision - Abstract
Recently, 6DoF object pose estimation has become increasingly important for a broad range of applications in the fields of virtual reality, augmented reality, autonomous driving, and robotic operations. This task involves extracting the target area from the input data and subsequently determining the position and orientation of the objects. In recent years, many new advances have been made in pose estimation. However, existing reviews summarize only category-level or instance-level methods and do not comprehensively cover deep learning methods. This paper provides a comprehensive review of the latest progress in 6D pose estimation to help researchers better understand this area. In this study, current methods for 6DoF object pose estimation are categorized into two groups, instance-level and category-level, based on whether it is necessary to acquire the CAD model of the object. Recent advancements in learning-based 6DoF pose estimation methods are comprehensively reviewed. The study systematically explores the innovations and applicable scenarios of various methods. It provides an overview of widely used datasets, task metrics, and diverse application scenarios. Furthermore, state-of-the-art methods are compared across publicly accessible datasets, taking into account differences in input data types. Finally, we summarize the challenges of current tasks, methods for different applications, and future development directions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
37. ODSPC: deep learning-based 3D object detection using semantic point cloud.
- Author
-
Song, Shuang, Huang, Tengchao, Zhu, Qingyuan, and Hu, Huosheng
- Subjects
OBJECT recognition (Computer vision) ,DEEP learning ,POINT cloud ,KALMAN filtering ,MULTISENSOR data fusion ,AUTONOMOUS vehicles - Abstract
Three-dimensional object detection plays a key role in autonomous driving, which becomes extremely challenging in occlusion situations. This paper presents a novel multimodal 3D object detection framework which fuses visual semantic information and depth point cloud information to accurately detect targets with distant object features and occlusion situations. The framework consists of four steps. Firstly, an improved semantic segmentation network is used to extract semantic information of objects containing similar features. Secondly, semantic images and point clouds are combined to generate pixel-level fusion data so that the semantic information and trainability of sparse, distant point clouds can be improved. Thirdly, a deep learning-based point cloud classification network is used for training of the fused data to output accurate detection frames. Fourthly, an extended Kalman filter is incorporated into point cloud prediction for image-based object detection to further enhance the robustness of object detection. Both Cityscapes and KITTI datasets are used in ablation study and experiments to validate the effectiveness of the proposed framework. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
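Entry 37 incorporates an extended Kalman filter into point cloud prediction; as a simplified illustration of the underlying predict/update cycle, here is a linear constant-velocity Kalman filter in one dimension. The motion model, noise values, and 1D setup are assumptions for illustration, not the authors' filter:

```python
import numpy as np

def kf_step(x, P, z, dt=0.1, q=1e-3, r=0.05):
    """One predict/update cycle of a 1D constant-velocity Kalman filter:
    state x = [position, velocity], covariance P, measurement z = the
    observed position (e.g. a detection center)."""
    F = np.array([[1.0, dt], [0.0, 1.0]])   # constant-velocity motion model
    H = np.array([[1.0, 0.0]])              # we observe position only
    Q = q * np.eye(2)                       # process noise
    R = np.array([[r]])                     # measurement noise
    # Predict
    x = F @ x
    P = F @ P @ F.T + Q
    # Update
    y = z - H @ x                           # innovation
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)          # Kalman gain
    x = x + (K @ y).ravel()
    P = (np.eye(2) - K @ H) @ P
    return x, P

# Track a target moving at 2 m/s, observed every 0.1 s; the velocity
# estimate converges even though only positions are measured.
x, P = np.array([0.0, 0.0]), np.eye(2)
for k in range(1, 50):
    z = np.array([2.0 * k * 0.1])           # noise-free position measurements
    x, P = kf_step(x, P, z)
```

The prediction step is what lets a tracker keep estimating an object's state through frames where the detector misses it, which is the robustness benefit the abstract describes.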
38. GA-RCNN: Graph self-attention feature extraction for 3D object detection.
- Author
-
Yi, Yangyang, Yu, Long, Tian, Shengwei, Gao, Xuezhuang, Li, Jie, and Zhao, Xingang
- Subjects
OBJECT recognition (Computer vision) ,POINT cloud ,FEATURE extraction ,AUTONOMOUS vehicles ,DEEP learning - Abstract
In recent years, 3D object detection based on LiDAR point clouds has been a key component of autonomous driving. In pursuit of enhancing the accuracy of 3D point cloud feature extraction and point cloud detection, this paper introduces a novel 3D object detection model, termed Graph Self-Attention-RCNN (GA-RCNN). This model is designed to integrate voxel information and point location information, enhancing the quality of 3D object proposals while maintaining contextual accuracy. The first stage rectifies the previous approach that relied on local features for preselected boxes, overlooking crucial global contextual information. An improved method is suggested in this work, utilizing the bird's-eye view (BEV) to capture long-range dependencies via a cross-attention mechanism. The second stage addresses the overreliance on local neighborhood point feature extraction. The Graph Self-Attention Pooling method is proposed, characterized by its dynamic computation of contribution weights for inputs. This enhances the model's flexibility and generalization performance. Extensive evaluations on KITTI and Waymo datasets demonstrate GA-RCNN's superior accuracy compared to other methods, affirming its efficacy in 3D object detection. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
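The dynamic contribution weights behind the Graph Self-Attention Pooling idea can be sketched as softmax-weighted pooling over point features: instead of max-pooling a neighborhood, each point contributes in proportion to a learned score. The scoring vector `w` below stands in for learned parameters and is purely an assumption; the actual GA-RCNN module is more elaborate.

```python
import numpy as np

def attention_pool(feats, w):
    """feats: (N, D) point features; w: (D,) scoring weights.
    Returns a (D,) pooled feature as a softmax-weighted sum of points."""
    scores = feats @ w                              # (N,) raw contribution scores
    scores = scores - scores.max()                  # numerical stability
    alpha = np.exp(scores) / np.exp(scores).sum()   # softmax contribution weights
    return alpha @ feats                            # convex combination of rows

rng = np.random.default_rng(0)
feats = rng.normal(size=(16, 8))                    # 16 points, 8-dim features
pooled = attention_pool(feats, rng.normal(size=8))
```

Because the output is a convex combination of the input rows, every pooled component stays within the per-dimension range of the point features, unlike a sum or unnormalized attention.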
39. Inter-Frame Compression for Dynamic Point Cloud Geometry Coding.
- Author
-
Akhtar, Anique, Li, Zhu, and Van der Auwera, Geert
- Subjects
POINT cloud ,DEEP learning ,MIXED reality ,GEOMETRY ,VIRTUAL reality ,AUTONOMOUS vehicles ,LATENT variables - Abstract
Efficient point cloud compression is essential for applications like virtual and mixed reality, autonomous driving, and cultural heritage. This paper proposes a deep learning-based inter-frame encoding scheme for dynamic point cloud geometry compression. We propose a lossy geometry compression scheme that predicts the latent representation of the current frame from the previous frame by employing a novel feature-space inter-prediction network. The proposed network utilizes sparse convolutions with hierarchical multiscale 3D feature learning to encode the current frame using the previous frame. The method introduces a novel predictor network for motion compensation in the feature domain, mapping the latent representation of the previous frame to the coordinates of the current frame to predict the current frame's feature embedding. The framework transmits the residual between the predicted and actual features, compressing it using a learned probabilistic factorized entropy model. At the receiver, the decoder hierarchically reconstructs the current frame by progressively rescaling the feature embedding. The proposed framework is compared to the state-of-the-art Video-based Point Cloud Compression (V-PCC) and Geometry-based Point Cloud Compression (G-PCC) schemes standardized by the Moving Picture Experts Group (MPEG). The proposed method achieves more than 88% BD-Rate (Bjøntegaard Delta Rate) reduction against G-PCCv20 Octree, more than 56% BD-Rate savings against G-PCCv20 Trisoup, more than 62% BD-Rate reduction against V-PCC intra-frame encoding mode, and more than 52% BD-Rate savings against V-PCC P-frame-based inter-frame encoding mode using HEVC. These significant performance gains are cross-checked and verified in the MPEG working group. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
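The feature-space residual coding idea above can be sketched in toy form: the decoder already holds the previous frame's features, a predictor maps them to the current frame, and only a quantized residual is transmitted. The copy predictor and uniform quantizer below are simplified stand-ins for the paper's learned prediction network and learned entropy model.

```python
import numpy as np

def encode(actual, predicted, step=0.1):
    """Quantize the prediction residual to integer symbols (the only
    payload that would be entropy-coded and transmitted)."""
    return np.round((actual - predicted) / step)

def decode(symbols, predicted, step=0.1):
    """Reconstruct current-frame features from the shared prediction
    plus the dequantized residual."""
    return predicted + symbols * step

rng = np.random.default_rng(2)
prev = rng.normal(size=32)                  # previous frame's features
actual = prev + 0.05 * rng.normal(size=32)  # current frame ~ previous frame
predicted = prev                            # trivial "copy" predictor (assumed)
sym = encode(actual, predicted)
recon = decode(sym, predicted)
err = np.max(np.abs(recon - actual))        # bounded by step / 2
```

The better the predictor, the smaller (and cheaper to entropy-code) the residual symbols become, which is exactly why a motion-compensating predictor network pays off for inter-frame coding.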
40. Multimodal contrastive learning using point clouds and their rendered images.
- Author
-
Lee, Wonyong and Kim, Hyungki
- Subjects
POINT cloud ,SUPERVISED learning ,AUTONOMOUS vehicles ,SCANNING systems ,CLASSIFICATION ,DEEP learning - Abstract
In this paper, we propose a novel unsupervised pre-training method for point cloud deep learning models using multimodal contrastive learning. Point clouds, sets of three-dimensional coordinate points acquired from 3D scanners, lidars, depth cameras, and similar sensors, play an important role in representing 3D scenes, and understanding them is crucial for implementing autonomous driving or navigation. Supervised deep learning models for point cloud understanding require a ground-truth label for each point cloud during training. However, generating these labels is expensive, making it difficult to build the large datasets essential for good model performance. Our proposed unsupervised pre-training method, in contrast, requires no labels and can serve as a model initialization that alleviates the need for such large datasets. The method is a multimodal approach that uses two modalities of a point cloud: the point cloud itself and an image rendering of it. By using images that directly render the point clouds, shape information from various viewpoints can be obtained without additional data such as meshes. We pre-trained a model with the proposed method and compared performance on the ModelNet40 and ScanObjectNN datasets. The linear classification accuracy on point cloud feature vectors extracted by the pre-trained model was 91.5% and 83.9%, and after fine-tuning on each dataset, the classification accuracy was 93.3% and 86.9%, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
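The multimodal contrastive objective can be sketched with an InfoNCE-style loss that pulls each point cloud embedding toward the embedding of its own rendered image and pushes it away from the other images in the batch. The embeddings and the temperature value here are illustrative stand-ins, not the paper's actual networks or hyperparameters.

```python
import numpy as np

def info_nce(pc_emb, img_emb, tau=0.07):
    """pc_emb, img_emb: (B, D) L2-normalized embeddings; row i of each
    modality is a matching pair. Returns the mean cross-entropy of
    picking the correct image for each point cloud."""
    logits = (pc_emb @ img_emb.T) / tau                 # (B, B) similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))                 # matched pairs on the diagonal

rng = np.random.default_rng(1)
x = rng.normal(size=(4, 16))
x /= np.linalg.norm(x, axis=1, keepdims=True)
loss_aligned = info_nce(x, x)                     # perfectly aligned pairs: low loss
loss_random = info_nce(x, np.roll(x, 1, axis=0))  # mismatched pairs: high loss
```

In actual pre-training, the two embedding towers (point cloud encoder and image encoder) are optimized jointly to minimize this loss, so that renderings supervise the point cloud encoder without any labels.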
41. JUIVCDv1: development of a still-image based dataset for indian vehicle classification.
- Author
-
Maity, Sourajit, Saha, Debam, Singh, Pawan Kumar, and Sarkar, Ram
- Subjects
CONVOLUTIONAL neural networks ,AUTOMATIC classification ,DEEP learning ,TRAFFIC engineering ,AUTONOMOUS vehicles - Abstract
An automatic vehicle classification (AVC) system designed from either still images or videos has the potential to bring significant benefits to the development of traffic control systems. Numerous articles on AVC have been published in the literature. Over the years, researchers in this domain have created and used a variety of datasets, but these datasets often do not reflect the exact scenarios of the Indian subcontinent, given its particular road conditions, road congestion patterns, and commonly seen vehicle types. The primary goal of this paper is to create a new still-image dataset, called JUIVCDv1, which contains 12 local vehicle classes collected using mobile cameras, for developing an automated vehicle management system. We also discuss the characteristics of current datasets and the other factors taken into account while creating a dataset for the Indian scenario. In addition, we benchmark results on the developed dataset using eight state-of-the-art pre-trained convolutional neural network (CNN) models, namely Xception, InceptionV3, DenseNet121, MobileNetV2, VGG16, NASNetMobile, ResNet50, and ResNet152. Among these, the Xception, InceptionV3, and DenseNet121 models produce the best classification accuracy scores of 0.94, 0.93, and 0.92, respectively. These models are further combined into ensembles to enhance overall categorization performance: majority voting-based, weighted average-based, and sum rule-based ensemble approaches give accuracy scores of 0.95, 0.94, and 0.94, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
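The majority-voting ensemble mentioned above can be sketched in a few lines: each base CNN votes with its argmax class, and the most common vote wins. The three probability vectors below are toy stand-ins, not actual Xception/InceptionV3/DenseNet121 outputs.

```python
import numpy as np
from collections import Counter

def majority_vote(prob_list):
    """prob_list: list of (num_classes,) probability vectors, one per
    model. Each model casts one vote for its argmax class."""
    votes = [int(np.argmax(p)) for p in prob_list]
    return Counter(votes).most_common(1)[0][0]

def average_vote(prob_list, weights=None):
    """Weighted-average ensemble: argmax of the (weighted) mean of the
    per-model probability vectors."""
    return int(np.argmax(np.average(np.stack(prob_list), axis=0, weights=weights)))

preds = [
    np.array([0.1, 0.7, 0.2]),   # model A votes class 1
    np.array([0.2, 0.5, 0.3]),   # model B votes class 1
    np.array([0.6, 0.3, 0.1]),   # model C votes class 0
]
winner = majority_vote(preds)    # class 1 wins the vote 2-to-1
avg = average_vote(preds)        # mean probabilities also favor class 1
```

The sum-rule and weighted-average variants differ only in how per-model probabilities are combined before the final argmax, which `average_vote` illustrates.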
42. Vision-Based Camera Pose Estimation Methods.
- Author
-
王静, 王一博, 郭铖, 郭苹, 叶星, and 邢淑军
- Subjects
- *
RESEARCH personnel , *AUTONOMOUS vehicles , *CAMERAS , *DEEP learning , *ROBOTICS - Abstract
Camera pose estimation plays a crucial role in tasks such as autonomous driving and robotics, determining the direction and position of the camera relative to a given scene by estimating its positional coordinates and the angular deviations around the three coordinate axes. To aid researchers in this realm, this paper comprehensively reviews the current research status and latest progress in camera pose estimation. Firstly, it introduces the fundamental principles, evaluation indicators, and pertinent datasets. Subsequently, it elaborates and summarizes the two-stage and single-channel model structure methods in terms of the two key technologies of scene relationship construction and camera pose calculation, classifying and analyzing them by their core algorithms and the scene information they employ, with performance comparisons drawn from indoor and outdoor public datasets. Lastly, it discusses the current challenges in the field and future development trends. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
43. Towards autonomous driving: Principles of LiDAR and sensing systems.
- Author
-
Zhang, Chen
- Subjects
- *
DEEP learning , *LIDAR , *AUTONOMOUS vehicles , *DRIVERLESS cars , *PHOTOMETRY , *LASER beams - Abstract
This paper describes the current state of the art in automotive LiDAR technology and its application in sensing systems. In self-driving cars, it is crucial to accurately sense the surroundings and detect the presence of other vehicles, pedestrians, and related objects. To improve safety and estimation accuracy, Light Detection and Ranging (LiDAR) systems are being introduced to complement camera- and radar-based perception systems. The article first describes LiDAR systems, analyzing the main components from the laser transmitter to the beam scanning mechanism, and then discusses their advantages and disadvantages and current technological solutions. Finally, the article reviews model-based approaches and emerging deep learning solutions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
44. Data-Driven Solutions for Next-Generation Automotive Cybersecurity
- Author
-
Koduru, Suprabhath, Machina, Siva Prasad, Madichetty, Sreedhar, and Mishra, Sukumar
- Published
- 2024
- Full Text
- View/download PDF
45. A deep learning approach to predicting vehicle trajectories in complex road networks
- Author
-
Sundari, K. and Thilak, A. Senthil
- Published
- 2024
- Full Text
- View/download PDF
46. Implementation of an improved multi-object detection, tracking, and counting for autonomous driving
- Author
-
Albouchi, Adnen, Messaoud, Seifeddine, Bouaafia, Soulef, Hajjaji, Mohamed Ali, and Mtibaa, Abdellatif
- Published
- 2024
- Full Text
- View/download PDF
47. Improving scheduling in multi-AGV systems by task prediction.
- Author
-
Fan, Hongkai, Li, Dong, Ouyang, Bo, Yan, Zhi, and Wang, Yaonan
- Subjects
AUTOMATED guided vehicle systems ,AUTONOMOUS vehicles ,DECISION making ,SCHOOL schedules - Abstract
Automated guided vehicles (AGVs) are driverless robotic vehicles that pick up and deliver materials. Finding ways to improve efficiency while preventing deadlocks is a core issue in designing AGV systems. In this paper, we propose an approach to improve the efficiency of traditional deadlock-free scheduling algorithms. Typically, AGVs have to travel from their parking locations to designated starting locations to execute tasks; the time required for this is referred to as preparation time. The proposed approach reduces preparation time by predicting the starting locations of future tasks and then deciding whether to send an AGV to the predicted starting location of the upcoming task, thus reducing the time spent waiting for an AGV to arrive after the upcoming task is created. Cases in which wrong predictions are made are also addressed. Simulation results show that the proposed method significantly improves efficiency, by up to 20–30% compared with traditional methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
48. Moving vehicle tracking and scene understanding: A hybrid approach.
- Author
-
Liu, Xiaoxu, Yan, Wei Qi, and Kasabov, Nikola
- Subjects
CONVOLUTIONAL neural networks ,TRANSFORMER models ,DEEP learning ,AUTONOMOUS vehicles ,TRAFFIC safety ,ROAD safety measures - Abstract
In this paper, we present a novel deep learning method for detecting and tracking vehicles within the context of autonomous driving, particularly focusing on scenarios related to vehicle failures. Precise identification and monitoring of vehicles is paramount for enhancing road safety in autonomous driving systems. Our contribution is a hybrid Siamese network that merges the capabilities of YOLO models with Transformers. This integration addresses the limitations of Convolutional Neural Networks (CNNs) in grasping high-level semantic nuances, thereby facilitating accurate detection and tracking of multiple vehicles within a given scene. Beyond this, we also curated a traffic scene dataset, which serves as a resource for training a multi-vehicle tracking model specifically tailored to the unique characteristics of the traffic environment. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
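The detection-to-track association implied by a Siamese tracker can be sketched as greedy matching on embedding similarity: compare each detection's embedding against stored track embeddings and assign it to its most similar free track above a threshold. The embeddings, threshold, and greedy policy below are illustrative assumptions rather than the paper's actual design.

```python
import numpy as np

def associate(track_embs, det_embs, thresh=0.5):
    """track_embs: (T, D), det_embs: (N, D) feature vectors.
    Returns {det_index: track_index} by greedy cosine similarity."""
    t = track_embs / np.linalg.norm(track_embs, axis=1, keepdims=True)
    d = det_embs / np.linalg.norm(det_embs, axis=1, keepdims=True)
    sim = d @ t.T                                   # (N, T) cosine similarities
    matches, used = {}, set()
    for i in np.argsort(-sim.max(axis=1)):          # most confident detections first
        for j in np.argsort(-sim[i]):               # best remaining track for this det
            if sim[i, j] >= thresh and int(j) not in used:
                matches[int(i)] = int(j)
                used.add(int(j))
                break
    return matches

tracks = np.array([[1.0, 0.0], [0.0, 1.0]])         # stored track embeddings (toy)
dets = np.array([[0.9, 0.1], [0.1, 0.8]])           # current-frame detections (toy)
m = associate(tracks, dets)
```

Detections left unmatched under this scheme would typically spawn new tracks, while tracks that go unmatched for several frames are retired.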
49. Multi-camera trajectory matching based on hierarchical clustering and constraints.
- Author
-
Szűcs, Gábor, Borsodi, Regő, and Papp, Dávid
- Subjects
HIERARCHICAL clustering (Cluster analysis) ,IMAGE recognition (Computer vision) ,OBJECT recognition (Computer vision) ,TRAFFIC monitoring ,DEEP learning ,OBJECT tracking (Computer vision) ,AUTONOMOUS vehicles - Abstract
The fast improvement of deep learning methods has resulted in breakthroughs in image classification, object detection, and object tracking. Autonomous driving and traffic monitoring systems, especially on-premise fixed-position multi-camera configurations, benefit greatly from these recent advances. In this paper, we propose a Multi-Camera Multi-Target (MCMT) vehicle tracking system using a constrained hierarchical clustering solution, which improves trajectory matching and thus provides more robust tracking of objects transitioning between cameras. YOLOv5, ByteTrack, and ResNet50-IBN ReID networks are used for vehicle detection and tracking. Static attributes such as vehicle type and vehicle color are determined from ReID features with an SVM; this ReID feature-based attribute categorization outperforms its pure CNN counterpart. Single-camera trajectories (SCTs) are combined into multi-camera trajectories (MCTs) using hierarchical agglomerative clustering (HAC) with time and space constraints (our proposed algorithm is denoted MCT#MAC). Similarities between SCTs are measured by comparing the mean ReID features accumulated over each trajectory. The system was evaluated on multiple datasets, and our experiments demonstrate that constraining HAC by manipulating the proximity matrix greatly improves the multi-camera IDF1 score. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
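Constraining agglomerative clustering through the proximity matrix, as described above, can be sketched by pushing forbidden pairs to a huge distance before merging. The greedy single-linkage merge, the toy 2D "features", and the same-camera constraint below are simplified stand-ins for the paper's ReID features and actual time/space constraints.

```python
import numpy as np

FORBIDDEN = 1e6  # effectively infinite distance for constrained pairs

def constrained_hac(feats, cameras, cut=1.0):
    """Greedy single-linkage merging on a constraint-penalized proximity
    matrix; merging stops once the closest cross-cluster pair of points
    exceeds `cut`. Returns one cluster label per trajectory."""
    n = len(feats)
    D = np.linalg.norm(feats[:, None] - feats[None, :], axis=-1)
    same_cam = np.array([[ci == cj for cj in cameras] for ci in cameras])
    # Constraint: two trajectories from the same camera cannot be the
    # same vehicle, so make their merge distance prohibitively large.
    D[same_cam & ~np.eye(n, dtype=bool)] = FORBIDDEN
    labels = list(range(n))
    while True:
        best, bi, bj = cut, -1, -1
        for i in range(n):
            for j in range(i + 1, n):
                if labels[i] != labels[j] and D[i, j] < best:
                    best, bi, bj = D[i, j], i, j
        if bi < 0:
            return labels            # no mergeable pair below the cut remains
        old, new = labels[bj], labels[bi]
        labels = [new if lab == old else lab for lab in labels]

feats = np.array([[0.0, 0.0],   # trajectory 0, camera A
                  [0.1, 0.0],   # trajectory 1, camera B (same vehicle as 0)
                  [5.0, 5.0],   # trajectory 2, camera A
                  [5.1, 5.0]])  # trajectory 3, camera B (same vehicle as 2)
cameras = ["A", "B", "A", "B"]
labels = constrained_hac(feats, cameras)
```

Because the constraint is encoded directly in the proximity matrix, any standard HAC routine can be used unchanged; the penalized entries simply guarantee those merges never happen below the cut.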
50. Real-Time Road Lane Detection for Self-driving Cars Using Computer Vision
- Author
-
Gupta, Meenu, Kumar, Rakesh, Bisht, Archana, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Verma, Om Prakash, editor, Wang, Lipo, editor, Kumar, Rajesh, editor, and Yadav, Anupam, editor
- Published
- 2024
- Full Text
- View/download PDF