Studies on neural networks: information propagation at initialisation and robustness to adversarial examples
- Author
- Ughi, Giuseppe and Tanner, Jared
- Subjects
- Deep learning (Machine learning), Derivative free optimisation, Random matrices
- Abstract
Over the last decade, the academic and industrial communities have become increasingly involved in the field of Deep Learning, and these algorithms have become the drivers of the AI revolution that has allowed "machines" to attain ever more extraordinary results. However, despite the impressive results achieved, there is still room for these algorithms to improve, and the use of mathematical techniques is central to this advancement. In this thesis we focus on two different themes in this space: the initialisation and the robustness of neural networks. Currently, the training of these algorithms is becoming increasingly expensive, not only financially but also ecologically. To speed up training and thereby alleviate these costs, one can focus on initialising the neural networks more effectively. In this thesis we define this optimality with measures taken from information theory. Specifically, we introduce a lower bound on the mutual information of neural networks and find that there is an initialisation procedure that allows signals to propagate through the neural networks with less loss of information. Finally, neural networks are vulnerable to adversarial attacks: small perturbations of regular inputs, although appearing insignificant to humans, can cause undesired model behaviours. In particular, we focus on identifying the methods that generate these adversarial perturbations, treating the neural networks as black boxes. To do this, we consider the state-of-the-art methods and compare them for the first time in exactly the same settings. We also introduce a further method to ensure that all the most common derivative-free optimisation techniques have been tested on this problem. Our experiments show that there is a class of methods that is particularly effective at generating these adversarial perturbations, but also that the various methods are affected differently by the setting in which they are deployed, indicating that there is no one-size-fits-all solution.
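As a purely illustrative companion to the black-box attack comparison described in the abstract, the sketch below shows the general flavour of a derivative-free attack: a toy random coordinate search that queries an assumed generic classifier `f` returning class probabilities. It is not any of the specific methods or models studied in the thesis, only a minimal example of attacking a model with query access alone.

```python
import numpy as np

def random_search_attack(f, x, true_label, eps=0.05, max_queries=1000, seed=0):
    """Toy derivative-free black-box attack: one random coordinate step per query,
    kept only if it lowers the classifier's probability for the true label.
    `f` maps a flat input array in [0, 1] to a vector of class probabilities,
    which is the only access the attacker is assumed to have."""
    rng = np.random.default_rng(seed)
    x_adv = x.copy()
    best_p = f(x_adv)[true_label]
    for _ in range(max_queries):
        candidate = x_adv.copy()
        idx = rng.integers(candidate.size)                        # pick a random coordinate
        candidate[idx] = np.clip(candidate[idx] + rng.choice([-eps, eps]), 0.0, 1.0)
        probs = f(candidate)                                      # single black-box query
        if probs.argmax() != true_label:                          # label flipped: success
            return candidate
        if probs[true_label] < best_p:                            # keep confidence-reducing steps
            x_adv, best_p = candidate, probs[true_label]
    return x_adv                                                  # best perturbation found
```

Any of the derivative-free optimisers compared in the thesis could replace the random search loop here; only the query-only interface to `f` is what makes the setting black-box.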
- Published
- 2022