1. Multiple object tracking using feature fusion in hierarchical LSTMs
- Author
- Ehtesham Hassan
- Subjects
image sequences, object tracking, object detection, image representation, video signal processing, image segmentation, image motion analysis, recurrent neural nets, learning (artificial intelligence), feature extraction, track modelling, motion coding scheme, relative position, motion representation, hierarchical LSTM structure, track association, multiple object tracking challenge datasets, feature fusion, hierarchical LSTM, intelligent video applications, recurrent neural networks, complex temporal dynamics, online tracking, active tracks, tracking-by-detection methodology, hierarchical long short term memory network structure, motion dynamics, motion cues, bounding boxes, object instance segments, Engineering (General). Civil engineering (General), TA1-2040
- Abstract
Multiple object tracking sets the foundation for many intelligent video applications. The authors present a novel tracking solution that uses the ability of recurrent neural networks to effectively model complex temporal dynamics between objects irrespective of appearance, pose, occlusion, and illumination. For online tracking, real-time and accurate association of objects with active tracks poses the major algorithmic challenge. Additionally, re-entry of objects must be correctly resolved. The authors follow the tracking-by-detection methodology, using a hierarchical long short-term memory (LSTM) network structure to model the motion dynamics between objects by learning the fusion of appearance and motion cues. Existing works capture the object's perspective for tracking within the detected bounding boxes. The authors also incorporate object instance segments for track modelling by applying the Mask R-CNN detector. They present a novel motion coding scheme that anchors the LSTM structure to effectively model the motion and relative position between objects in a single representation scheme. The proposed motion representation and the deep features representing object appearance are fused in an embedded space learned by the hierarchical LSTM structure to predict the object-to-track association. The authors present experimental validation of the proposed approach on multiple object tracking challenge datasets and demonstrate that their solution naturally deals with major tracking challenges under all uncertainties.
- Published
- 2020