
Generating and Exploiting Deep Learning Variants to Increase Utilization of the Heterogeneous Resources in Autonomous Driving Platforms

Authors :
Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
Tabani, Hamid
Kosmidis, Leonidas
Pujol Torramorell, Roger
Publication Year :
2020

Abstract

Deep learning-based solutions, and in particular deep neural networks (DNNs), are making their way into several core functionalities of critical real-time embedded systems (CRTES), such as those in planes, cars, and satellites, from vision-based perception (object detection and object tracking) to trajectory planning. As a result, several deep learning instances run simultaneously on the same computing platform at any given time. Modern computing platforms offer a variety of computing elements (e.g., CPUs, GPUs, and specific accelerators) on which those DNN instances can be executed depending on their computational requirements and temporal constraints. However, most DNNs are currently programmed to exploit one particular computing element, namely the regular cores of the GPU. This lack of variety causes resource imbalance and under-utilization of the remaining computing elements when several DNN instances execute concurrently, increasing the execution time of the DNN tasks. In this Thesis, (a) we develop variants (implementations) of well-known DNN libraries used in the Apollo Autonomous Driving software for each of the computing elements of the latest NVIDIA Xavier system-on-chip. Each variant is configured to balance resource requirements and performance: the regular CPU implementation can run on 2, 4, or 6 cores (always leaving 2 cores free for other computations); the GPU variants, using regular and Tensor cores, can run on 4 or 8 GPU Streaming Multiprocessors (SMs); and the accelerator variants can use 1 or 2 of NVIDIA's Deep Learning Accelerators (NVDLAs). (b) We show that each variant/configuration offers a different resource-utilization/performance point. (c) We show how those heterogeneous computing elements can be exploited by a static scheduler to sustain the execution of multiple and diverse DNN variants on the same platform.
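
The scheduling idea in the abstract can be illustrated with a small sketch. The Python snippet below is not taken from the Thesis: the variant names, resource demands, latency figures, and the greedy assignment policy are illustrative assumptions. It only sketches how a static (offline) scheduler might pick, for each DNN task, a variant/configuration so that the computing elements of a Xavier-like SoC are used without exceeding their capacity.

```python
# Illustrative sketch only: resource figures, latencies, and the greedy policy
# below are assumptions, not measurements or code from the Thesis.
from dataclasses import dataclass

# Assumed capacities of a Xavier-like SoC: 6 CPU cores available for DNNs,
# 8 GPU Streaming Multiprocessors (SMs), and 2 NVDLA engines.
CAPACITY = {"cpu_cores": 6, "gpu_sms": 8, "nvdla": 2}

@dataclass
class Variant:
    name: str          # hypothetical identifier, e.g. "det_gpu_4sm"
    demand: dict       # resources the variant occupies while executing
    latency_ms: float  # assumed per-inference latency of this configuration

# Hypothetical variants of two DNN tasks, mirroring the CPU (2/4/6 cores),
# GPU (4/8 SMs), and NVDLA (1/2 engines) options described in the abstract.
VARIANTS = {
    "detection": [
        Variant("det_gpu_8sm",  {"gpu_sms": 8},   12.0),
        Variant("det_gpu_4sm",  {"gpu_sms": 4},   21.0),
        Variant("det_nvdla_1",  {"nvdla": 1},     18.0),
        Variant("det_cpu_6c",   {"cpu_cores": 6}, 55.0),
    ],
    "tracking": [
        Variant("trk_gpu_4sm",  {"gpu_sms": 4},    9.0),
        Variant("trk_nvdla_1",  {"nvdla": 1},     11.0),
        Variant("trk_cpu_4c",   {"cpu_cores": 4}, 24.0),
    ],
}

def static_schedule(tasks, variants, capacity):
    """Greedy static assignment: for each task, pick the fastest variant
    whose resource demand still fits in the remaining capacity."""
    free = dict(capacity)
    plan = {}
    for task in tasks:
        for v in sorted(variants[task], key=lambda v: v.latency_ms):
            if all(free.get(r, 0) >= need for r, need in v.demand.items()):
                for r, need in v.demand.items():
                    free[r] -= need
                plan[task] = v
                break
        else:
            raise RuntimeError(f"no feasible variant left for task '{task}'")
    return plan, free

if __name__ == "__main__":
    plan, free = static_schedule(["detection", "tracking"], VARIANTS, CAPACITY)
    for task, v in plan.items():
        print(f"{task}: {v.name} ({v.latency_ms} ms), uses {v.demand}")
    print("remaining capacity:", free)
```

With these made-up numbers, the detection task lands on the 8-SM GPU variant and the tracking task on an NVDLA engine, showing how having multiple variants per DNN lets a static plan spread concurrent tasks across all computing elements instead of contending for the GPU alone.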

Details

Database :
OAIster
Notes :
application/pdf, English
Publication Type :
Electronic Resource
Accession number :
edsoai.on1238020801
Document Type :
Electronic Resource