
DCE-YOLOv8: Lightweight and Accurate Object Detection for Drone Vision

Authors :
Jinsu An
Dong Hee Lee
Muhamad Dwisnanto Putro
Byeong Woo Kim
Source :
IEEE Access, Vol 12, Pp 170898-170912 (2024)
Publication Year :
2024
Publisher :
IEEE, 2024.

Abstract

Object detection using drones is a sophisticated technology that combines a camera mounted on a drone with a computer vision algorithm to pinpoint the precise location of an object and ascertain its type. Drones can rapidly scan extensive areas, facilitating efficient data collection and analysis; this capability can yield critical information and support swift response efforts. The use of object detection technology in drones therefore offers numerous advantages. Nevertheless, despite the benefit of rapidly scanning wide areas, several challenges persist, including limited image resolution, the detection of small objects, overlapping objects, and densely concentrated object distributions. In this paper, we introduce DCE-YOLOv8, an advanced model based on YOLOv8, engineered to address the low detection rate of small objects in drone imagery. To detect small objects effectively, it is necessary either to enhance the resolution of drone images or to extract the features of small objects efficiently; the efficient integration of these extracted features is also crucial. The ERB (Efficient Residual Bottleneck) and DCE (Divided Context Extraction) modules are incorporated into the backbone: the ERB module reduces the number of parameters to make the model more lightweight, while the DCE module focuses on extracting features pertinent to small objects. The rate of missed detections is then reduced by comprehensively merging the shallow and deep features in the neck. The proposed method is trained on the VisDrone dataset and demonstrates superior detection performance compared to other state-of-the-art methods. Compared with the YOLOv8 small version on VisDrone, the mean Average Precision improved by approximately 43%, from 22.8 mAP to 32.7 mAP, while the number of parameters decreased by about 57%, from 11,166,560 to 4,822,382. The average inference time per image is 11.4 ms, slower than YOLOv8's 5.9 ms, yet it still maintains a frame rate of 87.71 FPS, underscoring its potential for real-time detection applications. Furthermore, the proposed method was evaluated on the TT100K and AFO datasets, where it demonstrates superior performance to YOLOv8 small while maintaining a comparable average inference time. This paper holds significant value in balancing object detection accuracy and real-time operational speed, and it can serve as a reference for further research in related fields.
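The abstract attributes the reduction in missed detections to comprehensively merging shallow and deep features in the neck, but does not specify the fusion design. The sketch below is only a minimal, generic illustration of shallow/deep feature fusion in the FPN style (upsample the deep map, concatenate with the shallow map, refine); the module name, channel sizes, and the use of PyTorch are assumptions for illustration and not the paper's actual implementation.

import torch
import torch.nn as nn

class ShallowDeepFusion(nn.Module):
    """Generic shallow/deep feature fusion (illustrative, not the paper's neck).

    The deep, low-resolution feature map is upsampled and concatenated with
    the shallow, high-resolution map, then refined with 1x1 and 3x3 convolutions.
    Channel sizes are illustrative assumptions.
    """
    def __init__(self, shallow_ch=128, deep_ch=256, out_ch=128):
        super().__init__()
        self.up = nn.Upsample(scale_factor=2, mode="nearest")
        self.reduce = nn.Conv2d(shallow_ch + deep_ch, out_ch, kernel_size=1)
        self.refine = nn.Sequential(
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.SiLU(),
        )

    def forward(self, shallow, deep):
        deep_up = self.up(deep)                       # match spatial resolution
        fused = torch.cat([shallow, deep_up], dim=1)  # merge along channel axis
        return self.refine(self.reduce(fused))

if __name__ == "__main__":
    shallow = torch.randn(1, 128, 80, 80)  # shallow, high-resolution features
    deep = torch.randn(1, 256, 40, 40)     # deep, low-resolution features
    out = ShallowDeepFusion()(shallow, deep)
    print(out.shape)  # torch.Size([1, 128, 80, 80])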
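The quoted gains follow directly from the figures in the abstract; the short script below only re-derives the roughly 43% mAP improvement, the roughly 57% parameter reduction, and the frame rate implied by the 11.4 ms inference time.

# Re-derive the improvement figures quoted in the abstract.
baseline_map, proposed_map = 22.8, 32.7
baseline_params, proposed_params = 11_166_560, 4_822_382
inference_ms = 11.4

map_gain = (proposed_map - baseline_map) / baseline_map * 100
param_drop = (baseline_params - proposed_params) / baseline_params * 100
fps = 1000.0 / inference_ms

print(f"mAP improvement: {map_gain:.1f}%")        # ~43.4%
print(f"Parameter reduction: {param_drop:.1f}%")  # ~56.8%
print(f"Frames per second: {fps:.1f}")            # ~87.7 (the abstract reports 87.71 FPS)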

Details

Language :
English
ISSN :
21693536
Volume :
12
Database :
Directory of Open Access Journals
Journal :
IEEE Access
Publication Type :
Academic Journal
Accession number :
edsdoj.63347d0e2aa84ef2826173bdf2f82980
Document Type :
article
Full Text :
https://doi.org/10.1109/ACCESS.2024.3481410