Self-Supervised Representation Learning from Temporal Ordering of Automated Driving Sequences
- Publication Year :
- 2023
Abstract
- Self-supervised feature learning enables perception systems to benefit from the vast raw data recorded by vehicle fleets worldwide. While video-level self-supervised learning approaches have shown strong generalizability on classification tasks, the potential to learn dense representations from sequential data remains relatively unexplored. In this work, we propose TempO, a temporal ordering pretext task for pre-training region-level feature representations for perception tasks. We embed each frame as an unordered set of proposal feature vectors, a representation that is natural for object detection or tracking systems, and formulate the sequential ordering task as predicting frame transition probabilities in a transformer-based multi-frame architecture whose complexity scales sub-quadratically with the sequence length. Extensive evaluations on the BDD100K, nuImages, and MOT17 datasets show that our TempO pre-training approach outperforms single-frame self-supervised learning methods as well as supervised transfer learning initialization strategies, achieving an improvement of +0.7% in mAP for object detection and +2.0% in the HOTA score for multi-object tracking.
- Comment: 12 pages, 7 figures
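The ordering objective described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the transformer-based multi-frame network is replaced here by a hand-written, permutation-invariant similarity between proposal sets, and the names `frame_similarity`, `transition_probabilities`, and `ordering_loss` are hypothetical helpers introduced only for illustration. The sketch assumes each frame is a list of proposal feature vectors and that the true successor of each frame is known during pre-training.

```python
import math

def frame_similarity(set_a, set_b):
    """Permutation-invariant score between two proposal sets (assumed stand-in
    for a learned network): for each proposal in set_a take its best dot-product
    match in set_b, then average over set_a."""
    total = 0.0
    for a in set_a:
        total += max(sum(x * y for x, y in zip(a, b)) for b in set_b)
    return total / len(set_a)

def transition_probabilities(frames):
    """Row i is a softmax over all other frames j, read as the probability
    that frame j directly follows frame i. Self-transitions are masked out."""
    n = len(frames)
    probs = []
    for i in range(n):
        scores = [
            frame_similarity(frames[i], frames[j]) if j != i else -math.inf
            for j in range(n)
        ]
        peak = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in scores]
        norm = sum(exps)
        probs.append([e / norm for e in exps])
    return probs

def ordering_loss(probs, successor):
    """Mean negative log-likelihood of the true successor of each frame.
    `successor` maps a frame index to the index of the frame that follows it."""
    terms = [-math.log(max(probs[i][j], 1e-12)) for i, j in successor.items()]
    return sum(terms) / len(terms)

# Toy sequence: three frames, two 2-D proposal vectors per frame (made-up values).
frames = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.9, 0.1], [0.1, 0.9]],
    [[0.5, 0.5], [0.2, 0.8]],
]
probs = transition_probabilities(frames)
loss = ordering_loss(probs, {0: 1, 1: 2})  # frame 0 -> frame 1 -> frame 2
print(probs)
print(loss)
```

In an actual pre-training setup the similarity would be produced by a trainable model and the loss minimized by gradient descent; the sketch only shows how a set-valued frame representation leads to a transition-probability matrix and an ordering loss.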
- Subjects :
- Computer Science - Computer Vision and Pattern Recognition
Details
- Database :
- arXiv
- Publication Type :
- Report
- Accession number :
- edsarx.2302.09043
- Document Type :
- Working Paper