1. Practical processing and acceleration of graph neural networks
- Author
- Tailor, Shyam and Lane, Nicholas
- Subjects
- computer science, machine learning, neural networks, pruning, quantization, neural architecture design, computer security
- Abstract
We have witnessed a dramatic surge in interest in Machine Learning (ML) in the past decade. Deep neural networks have achieved, or surpassed, human-level performance on diverse tasks ranging from image classification to game playing. In these applications, we typically observe that the input to the model has some form of regular structure: for example, images are a 2D grid. More recently, there has been interest in expanding the successes of the ML revolution to data without uniform structure, such as graphs. Graphs, consisting of a set of nodes and a set of edges defining relations between the nodes, offer tremendous flexibility for modelling. As a result, we see graph-based models applied to problems ranging from code analysis to recommender systems to drug discovery, achieving state-of-the-art performance and unlocking new applications for ML. With the proven potential of graph neural networks (GNNs) and the vast space of possible applications, it is natural to turn our attention towards practical issues that arise when we aim to deploy these models beyond a research context. One primary concern is efficiency: how do we design GNNs that consume fewer resources, such as time and memory, to scale our training to larger models and datasets and deploy our models to more resource-constrained devices? Moreover, once we release these models into the wild, how do we ensure they can withstand attacks from potential adversaries? These are the questions that motivate the work in this thesis: what novel techniques are necessary to make a step towards tackling these efficiency and security issues? A recurring theme in this thesis is that the loss of regular structure introduces several unique challenges to GNNs: techniques that may work for other common neural network architectures do not necessarily apply to GNNs.
The thesis begins by rigorously evaluating two hardware-software co-design techniques that are popular for other neural network architectures: quantisation, where we use low-precision arithmetic at inference time, and pruning, where we remove weights from the network. Next, we investigate efficient architecture design, first for general-purpose GNNs, and secondly for models specifically designed for processing point cloud data. Finally, the thesis describes a new type of security vulnerability associated with these models and discusses potential mitigations.
- Published
- 2022