444 results on '"object detector"'
Search Results
2. QPDet: Queuing People Detector for Aerial Images Based on Adaptive Soft Label Assignment Strategy
- Author
-
Zhang, Yi, Su, Yi, Li, Siying, Yi, Kai, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Antonacopoulos, Apostolos, editor, Chaudhuri, Subhasis, editor, Chellappa, Rama, editor, Liu, Cheng-Lin, editor, Bhattacharya, Saumik, editor, and Pal, Umapada, editor
- Published
- 2025
3. Integrating deep learning in target tracking applications, as enabler of control systems.
- Author
-
Mihalca, Vlad Ovidiu, Moldovan, Ovidiu, Ţarcă, Ianina, Anton, Daniel, and Noje, Dan
- Subjects
MOBILE robot control systems, OBJECT recognition (Computer vision), MOBILE robots, DETECTORS, DEEP learning
- Abstract
Target tracking is a key component of control systems with applications in various domains. Several examples of commercial applications may be given, which benefit from this technology: the development of car collision avoidance systems which must detect and track potential obstacles and hazards, or the development of UAVs that can track and record the evolution of athletes. For the purpose of our research, this technology was developed and tested as part of a less complex application. The current work describes a simple yet practical implementation of vision-based control for a mobile robot system. In the experiment, we used a mobile robot as target and a similar robot as follower. To achieve the tracking task, the used strategy involves the detection of a specific visual object mounted on the target, by extracting its features which are then used in issuing control commands within a remotely-closed loop. A Deep Learning approach is used for object detection, incorporating a detector model into the strategy while preserving an explicit controller in the overall scheme. The carried experiment has proven that this new approach provides the expected results, which make it a suitable tool for development of larger scale applications of control systems. [ABSTRACT FROM AUTHOR]
- Published
- 2024
4. Robust Object Detection Using Fire Hawks Optimizer with Deep Learning Model for Video Surveillance.
- Author
-
Prabu, S. and Gnanasekar, J. M.
- Subjects
*OBJECT recognition (Computer vision), *CONVOLUTIONAL neural networks, *VIDEO surveillance, *COMPUTER vision, *DEEP learning
- Abstract
In recent years, video surveillance has become an integral part of computer vision research, addressing a variety of challenges in security, memory management and content extraction from video sequences. This paper introduces the Robust Object Detection using Fire Hawks Optimizer with Deep Learning (ROD-FHODL) technique, a novel approach designed specifically for video surveillance applications. Combining object detection and classification the proposed technique employs a two-step procedure. Utilizing the power of the Mask Region-based Convolutional Neural Network (Mask-RCNN) for object detection, we optimize its hyperparameters using the Fire Hawks Optimizer (FHO) algorithm to improve its efficacy. Our experimental results on the UCSD dataset demonstrate the significant impact of the proposed work. It achieves an extraordinary RUNNT of 1.34 s on the pedestrian-1 dataset, significantly outperforming existing models. In addition, the proposed system surpasses in accuracy, with a pedestrian-1 accuracy rate of 97.45% and Area Under the Curve (AUC) values of 98.92%. Comparative analysis demonstrates the superiority of the proposed system in True Positive Rate (TPR) versus False Positive Rate (FPR) across thresholds. In conclusion, the proposed system represents a significant advancement in video surveillance, offering advances in speed, precision and robustness that hold promise for enhancing security, traffic management and public space monitoring in smart city infrastructure and other applications. [ABSTRACT FROM AUTHOR]
- Published
- 2024
5. Deep learning for automatic calcium detection in echocardiography.
- Author
-
Elvas, Luís B., Gomes, Sara, Ferreira, João C., Rosário, Luís Brás, and Brandão, Tomás
- Subjects
*AORTIC valve diseases, *IMAGE recognition (Computer vision), *NONINVASIVE diagnostic tests, *CONVOLUTIONAL neural networks, *COMPUTER-assisted image analysis (Medicine)
- Abstract
Cardiovascular diseases are the main cause of death in the world and cardiovascular imaging techniques are the mainstay of noninvasive diagnosis. Aortic stenosis is a lethal cardiac disease preceded by aortic valve calcification for several years. Data-driven tools developed with Deep Learning (DL) algorithms can process and categorize medical images data, providing fast diagnoses with considered reliability, to improve healthcare effectiveness. A systematic review of DL applications on medical images for pathologic calcium detection concluded that there are established techniques in this field, using primarily CT scans, at the expense of radiation exposure. Echocardiography is an unexplored alternative to detect calcium, but still needs technological developments. In this article, a fully automated method based on Convolutional Neural Networks (CNNs) was developed to detect Aortic Calcification in Echocardiography images, consisting of two essential processes: (1) an object detector to locate aortic valve – achieving 95% of precision and 100% of recall; and (2) a classifier to identify calcium structures in the valve – which achieved 92% of precision and 100% of recall. The outcome of this work is the possibility of automation of the detection with Echocardiography of Aortic Valve Calcification, a lethal and prevalent disease. [ABSTRACT FROM AUTHOR]
- Published
- 2024
6. DeepCraftFuse: visual and deeply-learnable features work better together for esophageal cancer detection in patients with Barrett's esophagus.
- Author
-
Souza Jr., Luis A., Pacheco, André G. C., Passos, Leandro A., Santana, Marcos C. S., Mendel, Robert, Ebigbo, Alanna, Probst, Andreas, Messmann, Helmut, Palm, Christoph, and Papa, João Paulo
- Subjects
*BARRETT'S esophagus, *ESOPHAGEAL cancer, *CANCER patients, *COMPUTER-aided diagnosis, *EARLY detection of cancer, *DEEP learning, *IDENTIFICATION
- Abstract
Limitations in computer-assisted diagnosis include lack of labeled data and inability to model the relation between what experts see and what computers learn. Even though artificial intelligence and machine learning have demonstrated remarkable performances in medical image computing, their accountability and transparency level must be improved to transfer this success into clinical practice. The reliability of machine learning decisions must be explained and interpreted, especially for supporting the medical diagnosis. While deep learning techniques are broad so that unseen information might help learn patterns of interest, human insights to describe objects of interest help in decision-making. This paper proposes a novel approach, DeepCraftFuse, to address the challenge of combining information provided by deep networks with visual-based features to significantly enhance the correct identification of cancerous tissues in patients affected with Barrett's esophagus (BE). We demonstrate that DeepCraftFuse outperforms state-of-the-art techniques on private and public datasets, reaching results of around 95% when distinguishing patients affected by BE that is either positive or negative to esophageal cancer. [ABSTRACT FROM AUTHOR]
- Published
- 2024
7. CCTV AI-Based System with Custom Object Detection and Video Upscaling Model
- Author
-
Al-Failakawi, Talal, Landoulsi, Yousef, Al-Maati, Shereef Abu, Chakrabarti, Amlan, Series Editor, Becker, Jürgen, Editorial Board Member, Hu, Yu-Chen, Editorial Board Member, Chattopadhyay, Anupam, Editorial Board Member, Tribedi, Gaurav, Editorial Board Member, Saha, Sriparna, Editorial Board Member, Goswami, Saptarsi, Editorial Board Member, Sharma, Dinesh K., editor, Hota, H. S., editor, and Rasheed Rababaah, Aaron, editor
- Published
- 2024
8. Application of Dynamic Graph CNN* and FICP for Detection and Research Archaeology Sites
- Author
-
Vokhmintcev, Aleksandr, Khristodulo, Olga, Melnikov, Andrey, Romanov, Matvei, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Ignatov, Dmitry I., editor, Khachay, Michael, editor, Kutuzov, Andrey, editor, Madoyan, Habet, editor, Makarov, Ilya, editor, Nikishina, Irina, editor, Panchenko, Alexander, editor, Panov, Maxim, editor, Pardalos, Panos M., editor, Savchenko, Andrey V., editor, Tsymbalov, Evgenii, editor, Tutubalina, Elena, editor, and Zagoruyko, Sergey, editor
- Published
- 2024
9. Adversarial Camera Patch: An Effective and Robust Physical-World Attack on Object Detectors.
- Author
-
Tiliwalidi, Kalibinuer, Bei Hui, Chengyin Hu, and Jingjing Ge
- Abstract
Physical adversarial attacks present a novel and growing challenge in cybersecurity, especially for systems reliant on physical inputs for Deep Neural Networks (DNNs), such as those found in Internet of Things (IoT) devices. They are vulnerable to physical adversarial attacks where real-world objects or environments are manipulated to mislead DNNs, thereby threatening the operational integrity and security of IoT devices. The camera-based attacks are one of the most practical adversarial attacks, which are easy to implement and more robust than all the other attack methods, and pose a big threat to the security of IoT. This paper proposes Adversarial Camera Patch (ADCP), a novel approach that employs a single-camera patch to launch robust physical adversarial attacks against object detectors. ADCP optimizes the physical parameters of the camera patch using Particle Swarm Optimization (PSO) to identify the most adversarial configuration. The optimized camera patch is then attached to the lens to generate stealthy and robust adversarial samples physically. The effectiveness of the proposed approach is validated through ablation experiments in a digital environment, with experimental results demonstrating its effectiveness even under worst-case scenarios (minimal width, maximum transparency). Notably, ADCP exhibits higher robustness in both digital and physical domains compared to the baseline. Given the simplicity, robustness, and stealthiness of ADCP, we advocate for attention towards the ADCP framework as it offers a means to achieve streamlined, robust, and stealthy physical attacks. Our adversarial attacks pose new challenges and requirements for cybersecurity. [ABSTRACT FROM AUTHOR]
- Published
- 2024
10. Real-Time Object Detection and Tracking Based on Embedded Edge Devices for Local Dynamic Map Generation.
- Author
-
Choi, Kyoungtaek, Moon, Jongwon, Jung, Ho Gi, and Suhr, Jae Kyu
- Subjects
OBJECT tracking (Computer vision), FEATURE extraction, SYSTEMS on a chip, DATABASES, SYSTEMS design
- Abstract
This paper proposes a camera system designed for local dynamic map (LDM) generation, capable of simultaneously performing object detection, tracking, and 3D position estimation. This paper focuses on improving existing approaches to better suit our application, rather than proposing novel methods. We modified the detection head of YOLOv4 to enhance the detection performance for small objects and to predict fiducial points for 3D position estimation. The modified detector, compared to YOLOv4, shows an improvement of approximately 5% mAP on the Visdrone2019 dataset and around 3% mAP on our database. We also proposed a tracker based on DeepSORT. Unlike DeepSORT, which applies a feature extraction network for each detected object, the proposed tracker applies a feature extraction network once for the entire image. To increase the resolution of feature maps, the tracker integrates the feature aggregation network (FAN) structure into the DeepSORT network. The difference in multiple objects tracking accuracy (MOTA) between the proposed tracker and DeepSORT is minimal at 0.3%. However, the proposed tracker has a consistent computational load, regardless of the number of detected objects, because it extracts a feature map once for the entire image. This characteristic makes it suitable for embedded edge devices. The proposed methods have been implemented on a system on chip (SoC), Qualcomm QCS605, using network pruning and quantization. This enables the entire process to be executed at 10 Hz on this edge device. [ABSTRACT FROM AUTHOR]
- Published
- 2024
11. Deep Learning-Based Object Detection and Classification for Autonomous Vehicles in Different Weather Scenarios of Quebec, Canada
- Author
-
Teena Sharma, Abdellah Chehri, Issouf Fofana, Shubham Jadhav, Siddhartha Khare, Benoit Debaque, Nicolas Duclos-Hindie, and Deeksha Arya
- Subjects
Autonomous vehicles, convolutional neural networks, intelligent transportation, object detector, surveillance, YOLOv8, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
The rapid development of self-driving vehicles requires integrating a sophisticated sensing system to address the various obstacles posed by road traffic efficiently. While several datasets are available to support object detection in autonomous vehicles, it is crucial to carefully evaluate the suitability of these datasets for different weather conditions across the globe. In response to this requirement, we present a novel dataset named the Canadian Vehicle Datasets (CVD). Subsequently, we present deep learning models that use this dataset. The CVD comprises street-level videos which were recorded by Thales, Canada. These videos were collected with high-quality cameras mounted on a vehicle in the Canadian province of Quebec. The recordings were made during daytime and nighttime, capturing weather conditions such as hazy, snowy, rainy, gloomy, nighttime and sunny days. A total of 10000 images of vehicles and other road assets are extracted from the collected videos. A total of 8388 images were annotated, yielding 27766 labels across 11 different classes. We analyzed the performance of the YOLOv8 model trained using the existing RoboFlow dataset. Then, we compared it with the model trained on the expanded version of RoboFlow using the proposed weather-specific dataset, CVD. Final values of improved accuracy of 73.26%, 72.84%, and 73.47% (Precision/Recall/mAP) were reported upon adding the proposed dataset. Finally, the model trained on this diverse dataset exhibits heightened robustness and proves highly beneficial for both autonomous and conventional vehicle operations, making it applicable not only in Canada but also in other countries with comparable weather conditions.
- Published
- 2024
12. A Comprehensive Survey on Replay Strategies for Object Detection
- Author
-
Shaik, Allabaksh, Basha, Shaik Mahaboob, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Swaroop, Abhishek, editor, Polkowski, Zdzislaw, editor, Correia, Sérgio Duarte, editor, and Virdee, Bal, editor
- Published
- 2023
13. DetOH: An Anchor-Free Object Detector with Only Heatmaps
- Author
-
Wu, Ruohao, Xiao, Xi, Hu, Guangwu, Zhao, Hanqing, Zhang, Han, Peng, Yongqing, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Yang, Xiaochun, editor, Suhartanto, Heru, editor, Wang, Guoren, editor, Wang, Bin, editor, Jiang, Jing, editor, Li, Bing, editor, Zhu, Huaijie, editor, and Cui, Ningning, editor
- Published
- 2023
14. Channel Pruning Method for Anchor-Free Detector
- Author
-
RAN Mengying, YANG Wenzhu, YIN Qunjie
- Subjects
anchor-free, object detector, attention module, channel pruning, Electronic computers. Computer science, QA75.5-76.95
- Abstract
Aiming at the problems of large redundant parameters, high computational cost and slow detection speed of the anchor-free detector, a channel pruning method guided by double attention modules (CPDAM) is proposed to compress the anchor-free object detectors. The performance of the channel attention and spatial attention submodules is further improved using pooling and group normalization. The improved channel attention and spatial attention submodules are fused using a channel grouping strategy and are continuously trained to generate a scale value for each channel indicating the importance of the channel on the classification task. A global scale value is calculated using the scale values and the channel pruning of the backbone network is performed based on the evaluation of channel importance by this value. The improved anchor-free object detector is experimentally validated on PASCAL VOC, ImageNet and CIFAR-100 datasets, and the experimental results show that the number of parameters of CenterNet-ResNet101 before and after pruning is decreased from 6.995×10^7 to 2.238×10^7, and the FPS is increased from 27 to 46, with only 0.6 percentage points mAP loss.
- Published
- 2023
15. GSA-DLA34: a novel anchor-free method for human-vehicle detection.
- Author
-
Chen, Xinying, Lv, Na, Lv, Shuo, and Zhang, Hao
- Subjects
DETECTORS, DEEP learning
- Abstract
Most anchor-free object detectors suffer from intersample imbalance, underutilization of multiscale features and long training times in traffic object dataset. As a result, the efficiency and accuracy of the detector may be significantly reduced for samples with few categories and small sizes. To address these problems, we propose a novel anchor-free approach, namely, GSA-DLA34, which is based on Gaussian kernel, sample weights, and attention. Its features are as follows. First, pyramid squeeze attention (PSA) is added after the backbone network to enhance multiscale traffic object representations. Second, for better object positioning with few categories and small scales, we design active sample weights for regression loss to make better information use. In addition, an elliptical Gaussian sampling module (EGSM) with a controllable Gaussian kernel shape is incorporated into the classification and regression branches to accelerate network training. The results show that our GSA-DLA34 has a significant advantage in balancing training time, inference speed, and accuracy. With an average precision of 89% on the PASCAL VOC dataset and an inference speed of 55.2 FPS on the RTX 2080 Ti, the GSA-DLA34 method can significantly improve human-vehicle recognition accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2023
16. Improved Deep Learning-Based Vehicle Detection for Urban Applications Using Remote Sensing Imagery.
- Author
-
Ragab, Mahmoud, Abdushkour, Hesham A., Khadidos, Adil O., Alshareef, Abdulrhman M., Alyoubi, Khaled H., and Khadidos, Alaa O.
- Subjects
*REMOTE sensing, *CONVOLUTIONAL neural networks, *REMOTE-sensing images, *DEEP learning, *DATABASES
- Abstract
Remote sensing (RS) data can be attained from different sources, such as drones, satellites, aerial platforms, or street-level cameras. Each source has its own characteristics, including the spectral bands, spatial resolution, and temporal coverage, which may affect the performance of the vehicle detection algorithm. Vehicle detection for urban applications using remote sensing imagery (RSI) is a difficult but significant task with many real-time applications. Due to its potential in different sectors, including traffic management, urban planning, environmental monitoring, and defense, the detection of vehicles from RS data, such as aerial or satellite imagery, has received greater emphasis. Machine learning (ML), especially deep learning (DL), has proven to be effective in vehicle detection tasks. A convolutional neural network (CNN) is widely utilized to detect vehicles and automatically learn features from the input images. This study develops the Improved Deep Learning-Based Vehicle Detection for Urban Applications using Remote Sensing Imagery (IDLVD-UARSI) technique. The major aim of the IDLVD-UARSI method emphasizes the recognition and classification of vehicle targets on RSI using a hyperparameter-tuned DL model. To achieve this, the IDLVD-UARSI algorithm utilizes an improved RefineDet model for the vehicle detection and classification process. Once the vehicles are detected, the classification process takes place using the convolutional autoencoder (CAE) model. Finally, a Quantum-Based Dwarf Mongoose Optimization (QDMO) algorithm is applied to ensure an optimal hyperparameter tuning process, demonstrating the novelty of the work. The simulation results of the IDLVD-UARSI technique are obtained on a benchmark vehicle database. The simulation values indicate that the IDLVD-UARSI technique outperforms the other recent DL models, with maximum accuracy of 97.89% and 98.69% on the VEDAI and ISPRS Potsdam databases, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2023
17. Exploiting Remote Sensing Imagery for Vehicle Detection and Classification Using an Artificial Intelligence Technique.
- Author
-
Alajmi, Masoud, Alamro, Hayam, Al-Mutiri, Fuad, Aljebreen, Mohammed, Othman, Kamal M., and Sayed, Ahmed
- Subjects
*REMOTE sensing, *ARTIFICIAL intelligence, *OPTIMIZATION algorithms, *SURFACE of the earth, *CONVOLUTIONAL neural networks, *DEEP learning
- Abstract
Remote sensing imagery involves capturing and examining details about the Earth's surface from a distance, often using satellites, drones, or other aerial platforms. It offers useful data with which to monitor and understand different phenomena on Earth. Vehicle detection and classification play a crucial role in various applications, including traffic monitoring, urban planning, and environmental analysis. Deep learning, specifically convolutional neural networks (CNNs), has revolutionized vehicle detection in remote sensing. This study designs an improved Chimp optimization algorithm with a DL-based vehicle detection and classification (ICOA-DLVDC) technique on RSI. The presented ICOA-DLVDC technique involves two phases: object detection and classification. For vehicle detection, the ICOA-DLVDC technique applies the EfficientDet model. Next, the detected objects can be classified by using the sparse autoencoder (SAE) model. To optimize the SAE's hyperparameters effectively, we introduce an ICOA which streamlines the parameter tuning process, accelerating convergence and enhancing the overall performance of the SAE classifier. An extensive set of experiments has been conducted to highlight the improved vehicle classification outcomes of the ICOA-DLVDC technique. The simulation values demonstrated the remarkable performance of the ICOA-DLVDC approach compared to other recent techniques, with a maximum accuracy of 99.70% and 99.50% on the VEDAI dataset and ISPRS Potsdam dataset, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2023
18. Visual detection of structural cracks using a deep deformable contour model.
- Author
-
赖彬 and 王森
- Subjects
*DEEP learning, *PROBLEM solving, *DATA augmentation, *SNAKES, *PIXELS
- Abstract
At present, deep learning instance segmentation methods for crack detection mainly generate a bounding box through object detection and then segment a mask pixel by pixel, which degrades the detection of structural crack contours and incurs complex post-processing costs. To solve this problem, this study proposes using the Deep Snake algorithm, a deep deformable contour model, to identify and detect structural cracks. The robustness of the model is improved by data augmentation of the structural crack dataset. At the same time, network parameters pre-trained on the large image dataset COCO are transferred to the structural crack segmentation model as initialization via transfer learning. The experimental results on the self-made crack image dataset show that the trained model can correctly identify crack objects and segment multiple crack targets at the same time. With an average detection time of 0.12 s, the AP50 reaches 75.4%. The comparison between the proposed method and other deep learning models and edge detection algorithms also reflects the advantages of the Deep Snake algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2023
19. Hybrid harris hawk-arithmetic optimization with deep learning-driven object detection and classification for surveillance video analysis
- Author
-
Saikrishnan, V. and Karthikeyan, M.
- Published
- 2024
20. Multi-Source Domain Fusion Cross-Domain Pedestrian Recognition Based on High-Quality Intermediate Domains
- Author
-
Yixing Niu, Wansheng Cheng, Yushan Lai, Hongzhi Zhang, Mingrui Cao, Kang Cao, and Song Fan
- Subjects
Domain adaptation, object detector, transfer learning, deep learning, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Pedestrian detection has received considerable attention over the last few years because it can be combined with pedestrian tracking and re-identification in areas such as vehicle-assisted driving and intelligent video surveillance. Although existing pedestrian detection techniques have achieved excellent results, problems such as domain gaps lead to poor generalization performance in these techniques, thereby limiting its application and practical value. This study proposed a high-quality integration domain framework for pedestrian recognition. First, the source domains are produced as super-resolution training data. The HCycleGAN model uses super-resolution algorithms and a generative framework to generate high-quality intermediate domains. Second, a multi-source domain fusion scheme based on the NPIQE module is proposed to improve the generated framework’s quality and reduce overfitting of the dataset. It fuses images in three aspects: similarity, blurriness and unsupervised image quality score values. Finally, we use an anchor-free center and scale prediction model for pedestrian detection. The experimental dataset contained two common pedestrian detection datasets, Caltech and CityPersons. Cross-domain experimental results show that the framework can reduce cross-domain detection miss rate from CityPersons to Caltech by 6.3% and from Caltech to CityPersons by 4.4%. The training of CityPersons in Caltech achieves almost the same detection accuracy as that of the Caltech original domain. In conclusion, the framework presented in this study is effective for cross-domain pedestrian detection and can provide ideas and inspiration for future practical applications.
- Published
- 2023
21. Hyperspectral Object Detection Using Bioinspired Jellyfish Search Optimizer With Deep Learning
- Author
-
Hany Mahgoub, Amani Abdulrahman Albraikan, Kamal M. Othman, Ahmed S. Salama, Ishfaq Yaseen, and Sara Saadeldeen Ibrahim
- Subjects
Remote sensing, hyperspectral imaging, deep learning, object detector, jellyfish search optimizer, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Hyperspectral images (HSI) provide a rich source of data for remote sensing applications, offering extensive spectral data about the Earth’s surface. Object detection in HSI remains a challenging process with various application areas in environmental monitoring, agriculture, and geospatial analysis. The development of deep learning (DL) models for HSI object detection paves the way for new opportunities in advanced remote sensing analysis. DL models enable the automated and reliable detection of target objects. Particularly, convolutional neural networks (CNNs) can handle the high-dimensional nature of hyperspectral data and efficiently learn complex relationships among spectral patterns and object classes. This results in improved detection performance and reduces the need for manual feature engineering. Therefore, this study presents a Hyperspectral Object Detection using Bioinspired Jellyfish Search Optimizer with Deep Learning (HSOD-JSODL) technique for Enhanced Remote Sensing Analysis. The aim of the HSOD-JSODL method lies in the effectual recognition of interested objects in the HSI using the DL model. To achieve this, the HSOD-JSODL technique employs EfficientDet object detector to recognize various kinds of objects in the HSI. EfficientDet is a recently developed object detector which integrates efficiency via a compound scaling approach and efficient network design. For the classification of detected objects, the HSOD-JSODL technique uses a deep belief network (DBN) classifier model. To improve the object classification results of the DBN model, the JSO algorithm is applied as a hyperparameter optimizer. The simulation analysis of the HSOD-JSODL technique is examined on the HSI dataset, and the outcomes are examined under various measures. The simulation values portrayed the betterment of the HSOD-JSODL technique over compared methods.
- Published
- 2023
22. Advanced System for Enhancing Location Identification through Human Pose and Object Detection.
- Author
-
Kevin, Medrano A., Crespo, Jonathan, Gomez, Javier, and Alfaro, César
- Subjects
OBJECT recognition (Computer vision), POSE estimation (Computer vision), MACHINE learning, MOBILE robots, COMPUTER vision
- Abstract
Location identification is a fundamental aspect of advanced mobile robot navigation systems, as it enables establishing meaningful connections between objects, spaces, and actions. Understanding human actions and accurately recognizing their corresponding poses play pivotal roles in this context. In this paper, we present an observation-based approach that seamlessly integrates object detection algorithms, human pose detection, and machine learning techniques to effectively learn and recognize human actions in household settings. Our method entails training machine learning models to identify the common actions, utilizing a dataset derived from the interaction between human pose and object detection. To validate our approach, we assess its effectiveness using a diverse dataset encompassing typical household actions. The results demonstrate a significant improvement over existing techniques, with our method achieving an accuracy of over 95% in classifying eight different actions within household environments. Furthermore, we ascertain the robustness of our approach through rigorous testing in real-world environments, demonstrating its ability to perform well despite the various challenges of data collection in such settings. The implications of our method for robotic applications are significant, as a comprehensive understanding of human actions is essential for tasks such as semantic navigation. Moreover, our findings unveil promising opportunities for future research, as our approach can be extended to learn and recognize a wide range of other human actions. This perspective, which highlights the potential leverage of these techniques, provides an encouraging path for future investigations in this field. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
23. Mobile Robot Navigation Based on Embedded Computer Vision.
- Author
-
Marroquín, Alberto, Garcia, Gonzalo, Fabregas, Ernesto, Aranda-Escolástico, Ernesto, and Farias, Gonzalo
- Subjects
*MOBILE robots , *OBJECT recognition (Computer vision) , *TRAFFIC monitoring , *TRAFFIC signs & signals , *COMPUTER vision , *ROBOTS , *PYTHON programming language , *ELECTRONIC systems - Abstract
Current computational advances allow the development of technological solutions using tools such as mobile robots and programmable electronic systems. We present a design that integrates the Khepera IV mobile robot with an NVIDIA Jetson Xavier NX board. This system executes an algorithm for navigation control based on computer vision and the use of a model for object detection. Among the functionalities that this integration adds to the Khepera IV in generating guided driving are trajectory tracking for safe navigation and the detection of traffic signs for decision-making. We built a robotic platform to test the system in real time. We also compared it with a digital model of the Khepera IV in the CoppeliaSim simulator. The navigation control results show significant improvements over previous works. This is evident in both the maximum navigation speed and the hit rate of the traffic sign detection system. We also analyzed the navigation control, which achieved an average success rate of 93%. The architecture allows testing new control techniques or algorithms based on Python, facilitating future improvements. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
24. Efficient Decoder-Free Object Detection with Transformers
- Author
-
Chen, Peixian, Zhang, Mengdan, Shen, Yunhang, Sheng, Kekai, Gao, Yuting, Sun, Xing, Li, Ke, Shen, Chunhua, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Avidan, Shai, editor, Brostow, Gabriel, editor, Cissé, Moustapha, editor, Farinella, Giovanni Maria, editor, and Hassner, Tal, editor
- Published
- 2022
- Full Text
- View/download PDF
25. Exploring Practical Deep Learning Approaches for English-to-Hindi Image Caption Translation Using Transformers and Object Detectors
- Author
-
Bisht, Paritosh, Solanki, Arun, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Jiming, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Hirche, Sandra, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Möller, Sebastian, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Oneto, Luca, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Zhang, Junjie James, Series Editor, Unhelker, Bhuvan, editor, Pandey, Hari Mohan, editor, and Raj, Gaurav, editor
- Published
- 2022
- Full Text
- View/download PDF
26. Multi-Object Multi-Camera Tracking Based on Deep Learning for Intelligent Transportation: A Review.
- Author
-
Fei, Lunlin and Han, Bing
- Subjects
*DEEP learning , *TRAFFIC safety , *PUBLIC safety - Abstract
Multi-Object Multi-Camera Tracking (MOMCT) is aimed at locating and identifying multiple objects from video captured by multiple cameras. With the advancement of technology in recent years, it has received a lot of attention from researchers in applications such as intelligent transportation, public safety and self-driving technology. As a result, a large number of excellent research results have emerged in the field of MOMCT. To facilitate the rapid development of intelligent transportation, researchers need to keep abreast of the latest research and current challenges in related fields. Therefore, this paper provides a comprehensive review of multi-object multi-camera tracking based on deep learning for intelligent transportation. Specifically, we first introduce the main object detectors for MOMCT in detail. Secondly, we give an in-depth analysis of deep learning based MOMCT and evaluate advanced methods through visualisation. Thirdly, we summarize the popular benchmark data sets and metrics to provide quantitative and comprehensive comparisons. Finally, we point out the challenges faced by MOMCT in intelligent transportation and present practical suggestions for the future direction. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
27. Simplification of Deep Neural Network-Based Object Detector for Real-Time Edge Computing.
- Author
-
Choi, Kyoungtaek, Wi, Seong Min, Jung, Ho Gi, and Suhr, Jae Kyu
- Subjects
*REAL-time computing , *EDGE computing , *DETECTORS , *SYSTEMS on a chip , *COMPUTATIONAL complexity , *KHAT - Abstract
This paper presents a method for simplifying and quantizing a deep neural network (DNN)-based object detector to embed it into a real-time edge device. For network simplification, this paper compares five methods for applying channel pruning to a residual block because special care must be taken regarding the number of channels when summing two feature maps. Based on the comparison in terms of detection performance, parameter number, computational complexity, and processing time, this paper discovers the most satisfying method on the edge device. For network quantization, this paper compares post-training quantization (PTQ) and quantization-aware training (QAT) using two datasets with different detection difficulties. This comparison shows that both approaches are recommended in the case of the easy-to-detect dataset, but QAT is preferable in the case of the difficult-to-detect dataset. Through experiments, this paper shows that the proposed method can effectively embed the DNN-based object detector into an edge device equipped with Qualcomm's QCS605 System-on-Chip (SoC), while achieving a real-time operation with more than 10 frames per second. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
28. Analysis of Movement and Activities of Handball Players Using Deep Neural Networks.
- Author
-
Host, Kristina, Pobar, Miran, and Ivasic-Kos, Marina
- Subjects
ARTIFICIAL neural networks ,HANDBALL players ,CONVOLUTIONAL neural networks ,COMPUTER vision ,TEAM sports - Abstract
This paper focuses on image and video content analysis of handball scenes and applying deep learning methods for detecting and tracking the players and recognizing their activities. Handball is a team sport of two teams played indoors with the ball with well-defined goals and rules. The game is dynamic, with fourteen players moving quickly throughout the field in different directions, changing positions and roles from defensive to offensive, and performing different techniques and actions. Such dynamic team sports present challenging and demanding scenarios for both the object detector and the tracking algorithms and other computer vision tasks, such as action recognition and localization, with much room for improvement of existing algorithms. The aim of the paper is to explore the computer vision-based solutions for recognizing player actions that can be applied in unconstrained handball scenes with no additional sensors and with modest requirements, allowing a broader adoption of computer vision applications in both professional and amateur settings. This paper presents semi-manual creation of a custom handball action dataset based on automatic player detection and tracking, and models for handball action recognition and localization using Inflated 3D Networks (I3D). For the task of player and ball detection, different configurations of You Only Look Once (YOLO) and Mask Region-Based Convolutional Neural Network (Mask R-CNN) models fine-tuned on custom handball datasets are compared to the original YOLOv7 model to select the best detector that will be used for tracking-by-detection algorithms. For player tracking, DeepSORT and Bag of tricks for SORT (BoT SORT) algorithms with Mask R-CNN and YOLO detectors were tested and compared. For the task of action recognition, an I3D multi-class model and an ensemble of binary I3D models are trained with different input frame lengths and frame selection strategies, and the best solution is proposed for handball action recognition.
The obtained action recognition models perform well on the test set with nine handball action classes, with average F1 measures of 0.69 and 0.75 for ensemble and multi-class classifiers, respectively. They can be used to index handball videos automatically to facilitate retrieval. Finally, some open issues, challenges in applying deep learning methods in such a dynamic sports environment, and directions for future development are discussed. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
29. DeoT: an end-to-end encoder-only Transformer object detector.
- Author
-
Ding, Tonghe, Feng, Kaili, Wei, Yanjun, Han, Yu, and Li, Tianping
- Abstract
At present, with the rapid development of Transformer in object detection tasks, the object detection performance has been significantly improved. However, Transformer-based object detectors generally suffer from high complexity and slow learning convergence, and there is still a certain gap in performance compared to some convolutional neural network (CNN)-based object detectors. Therefore, to address the existing problems of Transformer in object detection framework and make its detector performance reach the state-of-the-art level, this paper proposes an end-to-end encoder-only Transformer object detector, called DeoT. First, we design a feature pyramid fusion module (FPFM) to generate fusion features with rich semantic information. The proposal of the FPFM not only improves the detection accuracy of objects, but also solves the detection problem of objects of different sizes. Second, we propose an encoder-only Transformer module (E-OTM) to achieve a global representation of features by exploiting deformable multi-head self-attention (DMHSA). Furthermore, we design a Transformer block residual structure (TBRS) in the E-OTM, which refines the output features of the transformer module by using the channel attention and spatial attention in the channel refinement module (CRM) and spatial refinement module (SRM). The proposal of encoder-only Transformer module not only effectively alleviates the complexity and learning convergence problems of the model, but also improves the detection accuracy. We conduct sufficient experiments on the MS COCO object detection dataset and Cityscapes object detection dataset, and achieve 50.9 AP with 34 epochs on the COCO 2017 test-dev set, 30.1 AP with 38 FPS on the Cityscapes dataset. Therefore, DeoT not only achieves high efficiency in the training phase, but also ensures real time and accuracy in the detection process. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
30. Helmet Detection Based on an Enhanced YOLO Method
- Author
-
Zheng, Weizhou, Chang, Jiayi, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Jiming, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Hirche, Sandra, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Möller, Sebastian, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zhang, Junjie James, Series Editor, Wang, Wei, editor, Mu, Jiasong, editor, Liu, Xin, editor, Na, Zhenyu, editor, and Cai, Xiantao, editor
- Published
- 2021
- Full Text
- View/download PDF
31. Native Monkey Detection Using Deep Convolution Neural Network
- Author
-
Kumar, Pankaj, Shingala, Mitalee, Kacprzyk, Janusz, Series Editor, Pal, Nikhil R., Advisory Editor, Bello Perez, Rafael, Advisory Editor, Corchado, Emilio S., Advisory Editor, Hagras, Hani, Advisory Editor, Kóczy, László T., Advisory Editor, Kreinovich, Vladik, Advisory Editor, Lin, Chin-Teng, Advisory Editor, Lu, Jie, Advisory Editor, Melin, Patricia, Advisory Editor, Nedjah, Nadia, Advisory Editor, Nguyen, Ngoc Thanh, Advisory Editor, Wang, Jun, Advisory Editor, Hassanien, Aboul Ella, editor, Bhatnagar, Roheet, editor, and Darwish, Ashraf, editor
- Published
- 2021
- Full Text
- View/download PDF
32. Analysis of TEM micrographs with deep learning reveals APOE genotype-specific associations between HDL particle diameter and Alzheimer's dementia.
- Author
-
Zheng JJ, Hong BV, Agus JK, Tang X, Guo F, Lebrilla CB, Maezawa I, Jin LW, Vreeland WN, Ripple DC, and Zivkovic AM
- Subjects
- Humans, Female, Male, Aged, Aged, 80 and over, Alzheimer Disease genetics, Alzheimer Disease blood, Alzheimer Disease pathology, Lipoproteins, HDL blood, Lipoproteins, HDL genetics, Genotype, Apolipoproteins E genetics, Deep Learning, Particle Size
- Abstract
High-density lipoprotein (HDL) particle diameter distribution is informative in the diagnosis of many conditions, including Alzheimer's disease (AD). However, obtaining an accurate HDL size measurement is challenging. We demonstrated the utility of measuring the diameter of more than 1,800,000 HDL particles with the deep learning model YOLOv7 (you only look once) from micrographs of 183 HDL samples, including patients with dementia or normal cognition (controls). This method was shown to be more efficient and accurate than conventional image analysis software. Using this method, we found a higher abundance of small HDLs in participants with dementia compared to controls in patients with the apolipoprotein E (APOE) ε3ε4 genotype, whereas patients with the APOE ε3ε3 genotype had higher variability in the abundance of different HDL subclasses. Our results show an example of accurate individual HDL particle diameter measurement for large-scale clinical samples, which can be expanded to characterize the relationship between disease risk and other nanoparticles in the sub-20-nm diameter size range. Competing Interests: The authors declare no competing interests. (Copyright © 2024 The Author(s). Published by Elsevier Inc. All rights reserved.)
- Published
- 2025
- Full Text
- View/download PDF
33. DyCo: Dynamic, Contextualized AI Models.
- Author
-
YI YANG, SANKARADAS, MURUGAN, and CHAKRADHAR, SRIMAT
- Subjects
ARTIFICIAL intelligence ,SUPERVISED learning ,VEHICLE models ,DEEP learning ,DISTILLATION - Abstract
Devices with limited computing resources use smaller AI models to achieve low-latency inferencing. However, model accuracy is typically much lower than the accuracy of a bigger model that is trained and deployed in places where the computing resources are relatively abundant. We describe DyCo, a novel system that ensures privacy of stream data and dynamically improves the accuracy of small models used in devices. Unlike knowledge distillation or federated learning, DyCo treats AI models as black boxes. DyCo uses a semi-supervised approach to leverage existing training frameworks and network model architectures to periodically train contextualized, smaller models for resource-constrained devices. DyCo uses a bigger, highly accurate model in the edge-cloud to auto-label data received from each sensor stream. Training in the edge-cloud (as opposed to the public cloud) ensures data privacy, and bespoke models for thousands of live data streams can be designed in parallel by using multiple edge-clouds. DyCo uses the auto-labeled data to periodically retrain stream-specific, bespoke small models. To reduce the periodic training costs, DyCo uses different policies that are based on stride, accuracy, and confidence information. We evaluate our system, and the contextualized models, by using two object detection models for vehicles and people, and two datasets (a public benchmark and another real-world proprietary dataset). Our results show that DyCo increases the mAP accuracy measure of small models by an average of 16.3% (and up to 20%) for the public benchmark and an average of 19.0% (and up to 64.9%) for the real-world dataset. DyCo also decreases the training costs for contextualized models by more than an order of magnitude. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
34. Cervix type detection using a self‐supervision boosted object detection technique.
- Author
-
Bijoy, M. B., Akondi, Sai Manoj, Abdul Fathaah, S., Raut, Akash, Pournami, P. N., and Jayaraj, P. B.
- Subjects
*CERVICAL cancer , *PAP test , *OROPHARYNX , *NETWORK performance , *CANCER patients ,DEVELOPING countries - Abstract
Cervical cancer accounts for a large number of fatalities among cancer patients. It is ranked fourth in the total cancer patients and total number of deaths due to cancer. Developing countries account for 70% of the cases and 90% of the fatalities. Contemporary techniques used for screening cervical cancer are PAP smear test and HPV DNA test. Today there are treatments that can successfully prevent cervical cancer if detected at an early stage. Understanding the cervix type is very important for treatment; computational methods can help us classify the cervix type from cervical images. In this study, we propose an ROI proposal network EfficientCenterDet and a self‐supervision boosted training trick that improves the performance of the network with relatively less labeled data. We use 6114 unlabeled images to perform a pretraining task and 1166 labeled images to retrain the ROI proposal network. The proposed model matches the state‐of‐the‐art IOU of FasterRCNN on the ISIC skin lesions dataset while using one‐third of the number of parameters used in FasterRCNN. On MobileODT cervical data, our self‐supervision boosted model achieves 0.632 IOU, a 10% boost over the state‐of‐the‐art FasterRCNN. Introducing an ensembled EfficientNet B4, the cervix type classification stage achieved an accuracy of 87%. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
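The IOU figures quoted in the abstract of record 34 refer to intersection over union, the overlap ratio between a predicted region and a ground-truth region. A minimal Python sketch for axis-aligned boxes follows. The `(x1, y1, x2, y2)` corner format and the function name `box_iou` are assumptions made for illustration and are not taken from the paper.

```python
def box_iou(a, b):
    """Intersection over union of two axis-aligned boxes.

    Each box is (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2
    (an assumed format, chosen for illustration).
    """
    # Corners of the overlapping rectangle, if any.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

Two 2×2 boxes offset by one unit in each direction overlap in a 1×1 square, so their IOU is 1 / (4 + 4 − 1) = 1/7.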
35. Advanced System for Enhancing Location Identification through Human Pose and Object Detection
- Author
-
Medrano A. Kevin, Jonathan Crespo, Javier Gomez, and César Alfaro
- Subjects
computer vision ,semantic navigation ,machine learning ,human pose ,object detector ,algorithms ,Mechanical engineering and machinery ,TJ1-1570 - Abstract
Location identification is a fundamental aspect of advanced mobile robot navigation systems, as it enables establishing meaningful connections between objects, spaces, and actions. Understanding human actions and accurately recognizing their corresponding poses play pivotal roles in this context. In this paper, we present an observation-based approach that seamlessly integrates object detection algorithms, human pose detection, and machine learning techniques to effectively learn and recognize human actions in household settings. Our method entails training machine learning models to identify the common actions, utilizing a dataset derived from the interaction between human pose and object detection. To validate our approach, we assess its effectiveness using a diverse dataset encompassing typical household actions. The results demonstrate a significant improvement over existing techniques, with our method achieving an accuracy of over 95% in classifying eight different actions within household environments. Furthermore, we ascertain the robustness of our approach through rigorous testing in real-world environments, demonstrating its ability to perform well despite the various challenges of data collection in such settings. The implications of our method for robotic applications are significant, as a comprehensive understanding of human actions is essential for tasks such as semantic navigation. Moreover, our findings unveil promising opportunities for future research, as our approach can be extended to learn and recognize a wide range of other human actions. This perspective, which highlights the potential leverage of these techniques, provides an encouraging path for future investigations in this field.
- Published
- 2023
- Full Text
- View/download PDF
36. Automated anomaly detection of catenary split pins using unsupervised learning.
- Author
-
Wu, Yunpeng, Meng, Fanteng, Qin, Yong, Qian, Yu, Liu, Zhenliang, and Zhao, Weigang
- Subjects
*GENERATIVE adversarial networks , *BUILDING sites , *STRUCTURAL stability , *CATENARY , *DETECTORS - Abstract
Split pins (SPs) are essential for maintaining the structural stability of catenary support devices (CSDs) in high-speed railroads. Excitation and vibration induced by pantograph-catenary interactions would cause SP deterioration, including but not limited to, loosening, breaking, or missing SPs. Current supervised SP inspection systems struggle to meet expectations regarding general anomaly detection. This paper presents an efficient SP inspection system based on unsupervised learning. First, a lightweight and fast object detector is designed and combined with an incremental training strategy to sequentially localize the CSD joints and SPs. Second, an unsupervised autoencoder equipped with a perceptual loss, termed CSGAN (catenary-style generative adversarial network), is developed to accomplish the encoder-decoder process for SP reconstruction. Finally, an anomaly judgment index is integrated into this system for general SP anomaly indication. Extensive ablation and comparison experiments show the proposed approach surpasses existing state-of-the-art models in accuracy and inference speed. • Developed a cost-effective automated split pin inspection method during track construction and rehabilitation. • The proposed inspection method is based on a fast and lightweight object detector and an incremental training strategy. • Introduced a new unsupervised autoencoder named CSGAN (catenary-style generative adversarial network). • Defined a new anomaly judgment index based on mean absolute error (MAE) and structural similarity index measure (SSIM). [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
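The abstract of record 36 describes an anomaly judgment index built from mean absolute error and the structural similarity index measure, computed between a split-pin image and its reconstruction. The Python sketch below shows one way such an index could be formed. The equal weighting `alpha=0.5`, the single global window for structural similarity, the flat-list image representation, and all function names are assumptions for illustration; the abstract does not state how the paper combines the two terms.

```python
def mae(x, y):
    """Mean absolute error between two equal-length pixel sequences."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


def global_ssim(x, y, data_range=1.0):
    """Structural similarity computed over one global window.

    Simplification: practical implementations average the statistic
    over many local windows instead of using the whole image at once.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    # Standard stabilizing constants from the SSIM definition.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx * mx + my * my + c1) * (vx + vy + c2)
    )


def anomaly_score(original, reconstructed, alpha=0.5):
    """Hypothetical index: higher means the reconstruction fits worse."""
    dissimilarity = 1.0 - global_ssim(original, reconstructed)
    return alpha * mae(original, reconstructed) + (1 - alpha) * dissimilarity
```

Under this sketch a perfect reconstruction scores 0, and the score grows as the reconstruction diverges from the input, which is the behavior an autoencoder-based anomaly detector relies on.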
37. Person Detection in Thermal Videos Using YOLO
- Author
-
Ivasic-Kos, Marina, Kristo, Mate, Pobar, Miran, Kacprzyk, Janusz, Series Editor, Pal, Nikhil R., Advisory Editor, Bello Perez, Rafael, Advisory Editor, Corchado, Emilio S., Advisory Editor, Hagras, Hani, Advisory Editor, Kóczy, László T., Advisory Editor, Kreinovich, Vladik, Advisory Editor, Lin, Chin-Teng, Advisory Editor, Lu, Jie, Advisory Editor, Melin, Patricia, Advisory Editor, Nedjah, Nadia, Advisory Editor, Nguyen, Ngoc Thanh, Advisory Editor, Wang, Jun, Advisory Editor, Bi, Yaxin, editor, Bhatia, Rahul, editor, and Kapoor, Supriya, editor
- Published
- 2020
- Full Text
- View/download PDF
38. Machine Learning Agricultural Application Based on the Secure Edge Computing Platform
- Author
-
Fan, Wu, Xu, Zhuoqun, Liu, Huanghe, Zongwei, Zhu, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Chen, Xiaofeng, editor, Yan, Hongyang, editor, Yan, Qiben, editor, and Zhang, Xiangliang, editor
- Published
- 2020
- Full Text
- View/download PDF
39. Crowdsourcing-Based Indoor Semantic Map Construction and Localization Using Graph Optimization.
- Author
-
Li, Chao, Chai, Wennan, Yang, Xiaohui, and Li, Qingdang
- Subjects
*LOCALIZATION (Mathematics) , *SHOPPING malls , *MULTISENSOR data fusion - Abstract
The advancement of smartphones with multiple built-in sensors facilitates the development of crowdsourcing-based indoor map construction and localization. This paper proposes a crowdsourcing-based indoor semantic map construction and localization method using graph optimization. Using waypoints, semantic landmarks, and Wi-Fi landmarks as nodes and the relevance between waypoints and landmarks (i.e., waypoint–waypoint, waypoint–semantic, waypoint–Wi-Fi, semantic–semantic, and Wi-Fi–Wi-Fi) as edges, the optimization graph is constructed. The venue map is initialized with the single-track semantic map of the highest quality, as determined by a proposed map quality evaluation function. The aligned venue and candidate maps are optimized while satisfying the constraints, with the candidate map exhibiting the highest degree of similarity to the venue map. The lightweight venue map is then updated in terms of waypoint and landmark attributes, as well as the relationship between waypoints and landmarks. To determine a pedestrian's location on a venue map, similarities between a local map and a venue map are evaluated. Experiments conducted in an office building and shopping mall scenes demonstrate that crowdsourcing-based venue maps are superior to single-track semantic maps. Additionally, the landmark matching-based localization method can achieve a mean localization error of less than 0.5 m on the venue map, compared to 0.6 m in a single-track semantic map. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
40. Light weight object detector based on composite attention residual network and boundary location loss.
- Author
-
Xiao, Zehao, Dong, Enzeng, Tong, Jigang, Zhu, Lin, and Wang, Zenghui
- Subjects
*DEEP learning , *DETECTORS , *DATA integrity - Abstract
The object detector based on deep learning has received extensive attention, but the high computational cost has become an obstacle to its large-scale application. It is a great challenge for object detection to further reduce the hardware requirements on the premise of ensuring high detection accuracy. We propose a one-stage lightweight object detector and a new regression loss. In this method, ResNet is improved and combined with an attention mechanism to ensure the maximum integrity of feature information with fewer parameters; the multi-scale feature fusion network is improved to reduce the reasoning complexity of the structure. In addition, the bounding box regression loss is improved, and the specific position of the bounding box is adjusted by considering the balance of multiple factors in the regression process. The experimental results show that: 1) the combination of most detectors and the improved loss can further improve the performance of detectors; 2) as a whole, our improved network and loss can give consideration to both speed and accuracy on Pascal VOC and COCO; 3) with the addition of other new training tricks such as DropBlock and Mosaic, we can achieve better overall performance on the COCO test-dev set, 38.42 AP (average precision) at 40.3 FPS. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
41. Decision-Based Black-Box Adversarial Attack Method for Object Detectors.
- Author
-
付平, 郭玲, 刘冰, 朱玉晴, and 凤雷
- Published
- 2022
- Full Text
- View/download PDF
42. A CNN-based Approach for Cable-Suspended Load Lifting with an Autonomous MAV.
- Author
-
Lopez, Manuel and Martinez-Carranza, Jose
- Abstract
The popularity of Micro Aerial Vehicles (MAVs) in civilian applications has increased in recent years. In most of these applications, however, a MAV is used only to acquire aerial images and video of areas and structures of interest. MAVs could become more useful if they could interact with the environment. For instance, in a parcel delivery task, the goal is for the MAV to deliver a package somewhere, but what about having to pick up a package autonomously? This task raises some challenges: i) the MAV has to recognize where the package or object of interest is; ii) the MAV has to plan its maneuver to achieve the picking. In this paper, we address both challenges, considering the scenario where the MAV has a suspended cable that moves freely with a hook attached at the end of the cable. A suspended cable saves weight, although it has to be indirectly controllable with the MAV’s flight. Thus, we present a solution based on a Convolutional Neural Network that is trained to recognize the object of interest, in this case, a bucket; and that simultaneously recognizes the hook. Both objects are expected to be observed with a camera on board the MAV. Our method uses the distance between these two objects in a state machine controller to position the MAV and trigger the lifting maneuver in a single upward motion action that reduces the effects of air current on the hook. We use synthetic datasets to train the bucket and hook detector, but the model is capable of performing the detection in real environments. We achieved an average lifting success rate of 70% for indoor and 60% for outdoor scenarios. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
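The abstract of record 42 describes a state machine controller that uses the distance between the detected hook and the detected bucket to position the MAV and then trigger the lift. A minimal Python sketch of that idea follows. The three state names, the pixel-space detections, the `trigger_px` threshold, and the function `step` are all hypothetical and chosen for illustration; the abstract does not specify the paper's actual states or thresholds.

```python
import math
from enum import Enum, auto


class State(Enum):
    SEARCH = auto()  # hook or bucket not visible yet
    ALIGN = auto()   # both visible, still too far apart
    LIFT = auto()    # committed to the upward lifting motion


def step(state, hook, bucket, trigger_px=20.0):
    """Advance the controller by one camera frame.

    hook and bucket are (x, y) detection centers in pixels, or None
    when the detector did not find the object (assumed interface).
    """
    if state is State.LIFT:
        # The lift is a single upward motion, so it is never interrupted.
        return State.LIFT
    if hook is None or bucket is None:
        return State.SEARCH
    distance = math.dist(hook, bucket)
    return State.LIFT if distance <= trigger_px else State.ALIGN
```

Keeping `LIFT` as an absorbing state mirrors the single upward motion mentioned in the abstract: once the hook is close enough, later detections no longer change the decision.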
43. High-Precision Binary Object Detector Based on a BSF-XNOR Convolutional Layer
- Author
-
Shaobo Wang, Cheng Zhang, Di Su, Longlong Wang, and Huan Jiang
- Subjects
Binary network ,object detector ,embedded device ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
Recently, building an efficient and robust model for object detection has attracted the attention of the vision community. Although binary networks have a fast inference speed, they cannot be used directly on mobile devices such as unmanned aerial vehicles (UAVs) because of their low detection accuracy. Different from improving the detection accuracy of a binary network by adjusting the network structure or adjusting the update gradient, we propose an improved binary neural network based on the block scaling factor XNOR (BSF-XNOR) convolutional layer. In addition, we propose a two-level densely connected network structure, which further enhances the network layer’s feature representation capabilities. Experiments using the TensorFlow framework prove the effectiveness of our algorithm in improving accuracy. Compared with the original standard XNOR network, the mean average precision (mAP) detected by our algorithm on the PASCAL VOC dataset was improved. The experimental results on the VisDrone2019 UAV dataset confirm that our method achieves a better balance between speed and accuracy than previous methods. Our algorithm aims to guide and deploy high-precision binary networks on the embedded device and solves the problem of low-precision binary networks.
- Published
- 2021
- Full Text
- View/download PDF
44. Zero-Centered Fixed-Point Quantization With Iterative Retraining for Deep Convolutional Neural Network-Based Object Detectors
- Author
-
Sungrae Kim and Hyun Kim
- Subjects
Convolutional neural network, deep neural network, fixed-point quantization, network compression, object detector, YOLOv3, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
In the field of object detection, deep learning has greatly improved accuracy compared to previous algorithms and has been used widely in recent years. However, object detection using deep learning requires substantial hardware (HW) resources because of the heavy computation needed for high performance, making it very difficult to run in real time on embedded platforms. Therefore, various compression methods have been studied to solve this problem. In particular, quantization methods greatly reduce the computational burden of deep learning by reducing the number of bits used for weights and activations. However, most of these existing studies targeted only object classification and cannot be applied to object detection. Furthermore, most of the existing quantization studies are based on floating-point operations, which requires additional effort when implementing HW accelerators. This paper proposes an HW-friendly fixed-point-based quantization method that can also be applied to object detection. In the proposed method, the center of the weight distribution is adjusted to zero by subtracting the mean of the weight parameters before quantization, and the retraining process is iteratively applied to minimize the accuracy drop caused by quantization. Furthermore, while applying the proposed method to object detection, performance degradation is minimized by considering the minimum and maximum values of the weight parameters of deep learning networks. When the proposed quantization method is applied to representative one-stage object detectors, You Only Look Once v3 and v4 (YOLOv3 and YOLOv4), detection accuracy similar to that of the original networks with a single-precision floating-point format (32-bit) is maintained on the COCO dataset, despite expressing weights with only about 20% of the bits of that format.
- Published
- 2021
- Full Text
- View/download PDF
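The zero-centering step described in the abstract of entry 44 (subtract the mean of the weights, then quantize using their minimum and maximum) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 6-bit default, the symmetric scaling rule, and the plain-list representation are all assumptions made for illustration, and the iterative retraining step is not shown.

```python
from statistics import fmean


def zero_center_quantize(weights, num_bits=6):
    """Quantize weights to signed integer codes after subtracting their mean.

    Hypothetical helper: the bit width and scaling rule are illustrative
    assumptions, not values taken from the paper.
    """
    mean = fmean(weights)
    # Shift the weight distribution so that it is centered on zero.
    centered = [w - mean for w in weights]
    # Use the centered minimum and maximum to pick a symmetric range.
    max_abs = max(abs(min(centered)), abs(max(centered)))
    q_max = 2 ** (num_bits - 1) - 1
    scale = max_abs / q_max if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to the representable range.
    codes = [max(-q_max, min(q_max, round(c / scale))) for c in centered]
    return codes, scale, mean


def dequantize(codes, scale, mean):
    """Map integer codes back to approximate floating-point weights."""
    return [q * scale + mean for q in codes]


if __name__ == "__main__":
    w = [0.5, 0.7, 0.9, 1.1, 1.3]
    codes, scale, mean = zero_center_quantize(w, num_bits=4)
    print(codes, scale, mean)
    print(dequantize(codes, scale, mean))
```

In this sketch the mean is stored alongside the scale so that the original weights can be approximately recovered. How the mean is handled in an actual fixed-point accelerator is a design question this example does not address.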
45. Automatic Person Detection in Search and Rescue Operations Using Deep CNN Detectors
- Author
-
Sasa Sambolek and Marina Ivasic-Kos
- Subjects
Convolutional neural networks, object detector, person detection, search and rescue operations, UAV, YOLO, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Due to a growing number of people who carry out various adrenaline activities or adventure tourism and stay in the mountains and other inaccessible places, there is an increasing need to organize search and rescue (SAR) operations to provide assistance and health care to the injured. The goal of a SAR operation is to search the largest area of the territory in the shortest time possible and find a lost or injured person. Today, unmanned aerial vehicles (UAVs, or drones) are increasingly involved in search operations, as they can capture a large, controlled area in a short amount of time. However, a detailed examination of a large amount of recorded material remains a problem. Even for an expert, it is not easy to find the people being searched for, who are relatively small considering the area where they are, often sheltered by vegetation or merged with the ground and in unusual positions due to falls, injuries, or exhaustion. Therefore, the automatic detection of persons and objects in images/videos taken by drones in these operations is very significant. In this paper, the reliability of existing state-of-the-art detectors such as Faster R-CNN, YOLOv4, RetinaNet, and Cascade R-CNN on the VisDrone benchmark and on SARD, a custom-made dataset built to simulate rescue scenes, was investigated. After training the models on selected datasets, detection results were compared. Because of its high speed and accuracy and its small number of false detections, the YOLOv4 detector was chosen for further examination. YOLOv4 model results related to different network sizes, different detection accuracies, and transfer learning settings were analyzed. The model's robustness to weather conditions and motion blur was also investigated. The paper proposes a model that can be used in SAR operations because of its excellent results in detecting people in search and rescue scenarios.
- Published
- 2021
- Full Text
- View/download PDF
46. Dynamic Anchor: A Feature-Guided Anchor Strategy for Object Detection.
- Author
-
Liu, Xing, Chen, Huai-Xin, and Liu, Bi-Yuan
- Subjects
ANCHORS, DETECTORS
- Abstract
The majority of modern object detectors rely on a set of pre-defined anchor boxes, which enhances detection performance dramatically. Nevertheless, the pre-defined anchor strategy suffers from some drawbacks, especially the complex anchor hyper-parameters, which seriously affect detection performance. In this paper, we propose a feature-guided anchor generation method named dynamic anchor. Dynamic anchor mainly includes two structures: the anchor generator and the feature enhancement module. The anchor generator leverages semantic features to predict optimized anchor shapes at the locations where the objects are likely to exist in the feature maps; by converting the predicted shape maps into location offsets, the feature enhancement module uses the high-quality anchors to improve detection performance. Compared with the hand-designed anchor scheme, dynamic anchor discards all pre-defined boxes and avoids complex hyper-parameters. In addition, only one anchor box is predicted for each location, which dramatically reduces computation. With ResNet-50 and ResNet-101 as the backbone of the one-stage detector RetinaNet, dynamic anchor achieved 2.1 AP and 1.0 AP gains, respectively. The proposed dynamic anchor strategy can be easily integrated into anchor-based detectors to replace the traditional pre-defined anchor scheme. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
47. Multi-scale global context feature pyramid network for object detector.
- Author
-
Li, Yunhao, Shao, Mingwen, Fan, Bingbing, and Zhang, Wei
- Abstract
In order to capture more contextual information, various attention mechanisms are applied to object detectors. However, the spatial interaction in the commonly used attention mechanisms is single-scale, so it cannot capture the context information of objects from feature maps of different scales, which leads to the underutilization of context information. In addition, since the predicted bounding box does not completely fit the shape and pose of the object, there is room for further improvement in performance. In this paper, we propose a multi-scale global context feature pyramid network, a two-layer lightweight neck structure, to obtain a feature pyramid with richer context information. Moreover, we extend the regression branch with an additional prediction head that predicts the corner offsets of the bounding boxes, which further refines the boxes and can effectively improve their accuracy. Extensive experiments are conducted on the MS COCO 2017 detection dataset. Without bells and whistles, the proposed methods show an average 2% improvement over the RetinaNet baseline. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
48. Analysis of Movement and Activities of Handball Players Using Deep Neural Networks
- Author
-
Kristina Host, Miran Pobar, and Marina Ivasic-Kos
- Subjects
sports, object detector, object tracking, action recognition, video analysis, YOLO, Photography, TR1-1050, Computer applications to medicine. Medical informatics, R858-859.7, Electronic computers. Computer science, QA75.5-76.95
- Abstract
This paper focuses on image and video content analysis of handball scenes and on applying deep learning methods for detecting and tracking the players and recognizing their activities. Handball is an indoor ball sport played by two teams, with well-defined goals and rules. The game is dynamic, with fourteen players moving quickly throughout the field in different directions, changing positions and roles from defensive to offensive, and performing different techniques and actions. Such dynamic team sports present challenging and demanding scenarios for object detection and tracking algorithms as well as for other computer vision tasks, such as action recognition and localization, with much room for improvement of existing algorithms. The aim of the paper is to explore computer vision-based solutions for recognizing player actions that can be applied in unconstrained handball scenes with no additional sensors and with modest requirements, allowing a broader adoption of computer vision applications in both professional and amateur settings. This paper presents the semi-manual creation of a custom handball action dataset based on automatic player detection and tracking, and models for handball action recognition and localization using Inflated 3D Networks (I3D). For the task of player and ball detection, different configurations of You Only Look Once (YOLO) and Mask Region-Based Convolutional Neural Network (Mask R-CNN) models fine-tuned on custom handball datasets are compared to the original YOLOv7 model to select the best detector to be used for tracking-by-detection algorithms. For player tracking, the DeepSORT and Bag of tricks for SORT (BoT SORT) algorithms with Mask R-CNN and YOLO detectors were tested and compared. For the task of action recognition, an I3D multi-class model and an ensemble of binary I3D models are trained with different input frame lengths and frame selection strategies, and the best solution is proposed for handball action recognition.
The obtained action recognition models perform well on the test set with nine handball action classes, with average F1 measures of 0.69 and 0.75 for the ensemble and multi-class classifiers, respectively. They can be used to index handball videos automatically to facilitate retrieval. Finally, some open issues, challenges in applying deep learning methods in such a dynamic sports environment, and directions for future development are discussed.
- Published
- 2023
- Full Text
- View/download PDF
49. Convolutional Neural Networks for the Identification of Filaments from Fast Visual Imaging Cameras in Tokamak Reactors
- Author
-
Cannas, Barbara, Carcangiu, Sara, Fanni, Alessandra, Lupelli, Ivan, Militello, Fulvio, Montisci, Augusto, Pisano, Fabio, Sias, Giuliana, Walkden, Nick, Howlett, Robert James, Series Editor, Jain, Lakhmi C., Series Editor, Esposito, Anna, editor, Faundez-Zanuy, Marcos, editor, Morabito, Francesco Carlo, editor, and Pasero, Eros, editor
- Published
- 2019
- Full Text
- View/download PDF
50. Detection of GUI Elements on Sketch Images Using Object Detector Based on Deep Neural Networks
- Author
-
Yun, Young-Sun, Jung, Jinman, Eun, Seongbae, So, Sun-Sup, Heo, Junyoung, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Jiming, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Ruediger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Hirche, Sandra, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Liang, Qilian, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Möller, Sebastian, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zhang, Junjie James, Series Editor, Hwang, Seong Oun, editor, Tan, Syh Yuan, editor, and Bien, Franklin, editor
- Published
- 2019
- Full Text
- View/download PDF