
State space representation and phase analysis of gradient descent optimizers.

Authors:
Yao, Biyuan
Li, Guiqing
Wu, Wei
Source:
SCIENCE CHINA Information Sciences; Apr 2023, Vol. 66, Issue 4, p1-15, 15p
Publication Year:
2023

Abstract

Optimizers play a key role in deep learning networks, which have achieved good results in the field of image recognition. In this work, dynamical-system models of the optimizers are established, and the influence of parameter adjustments on the dynamic performance of the system is analyzed. This is a useful supplement to the theoretical control models of optimizers. First, the system control model is derived from the iterative formula of the optimizer: the optimizer is expressed as a set of differential equations, and its control equation is established. Second, based on this control model, the phase trajectories of the optimizer and the influence of different hyperparameters on the system performance of the learning model are analyzed. Finally, controllers with different optimizers and different hyperparameters are used to classify the MNIST and CIFAR-10 datasets, verifying the effects of different optimizers on learning performance and comparing them with related methods. Experimental results show that selecting an appropriate optimizer can accelerate the convergence of the model and improve recognition accuracy. Furthermore, the convergence speed and performance of the stochastic gradient descent (SGD) optimizer are better than those of the stochastic gradient descent with momentum (SGD-M) and Nesterov accelerated gradient (NAG) optimizers. [ABSTRACT FROM AUTHOR]
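The abstract does not reproduce the paper's control equations, but the three optimizers it compares have standard textbook update rules whose small-step limits are ordinary differential equations. The following is a sketch of those standard forms (learning rate eta, momentum coefficient mu), not the paper's exact derivation:

```latex
\begin{aligned}
\text{SGD:}   \quad & \theta_{t+1} = \theta_t - \eta\,\nabla f(\theta_t) \\
\text{SGD-M:} \quad & v_{t+1} = \mu v_t - \eta\,\nabla f(\theta_t),
                      \qquad \theta_{t+1} = \theta_t + v_{t+1} \\
\text{NAG:}   \quad & v_{t+1} = \mu v_t - \eta\,\nabla f(\theta_t + \mu v_t),
                      \qquad \theta_{t+1} = \theta_t + v_{t+1} \\[4pt]
\text{gradient flow (SGD limit):}    \quad & \dot{\theta}(t) = -\nabla f(\theta(t)) \\
\text{heavy-ball ODE (SGD-M limit):} \quad & \ddot{\theta}(t) + a\,\dot{\theta}(t) + b\,\nabla f(\theta(t)) = 0
\end{aligned}
```

Dynamical-system models of this kind are what a phase-trajectory analysis operates on. To illustrate what such a trajectory looks like, here is a minimal, self-contained Python sketch (not the authors' code) that iterates the momentum update on a one-dimensional quadratic and plots position against velocity; the quadratic objective and all parameter values are illustrative assumptions:

```python
# Phase-trajectory sketch for momentum (heavy-ball) dynamics on the
# quadratic f(theta) = 0.5 * k * theta^2. Illustrative only.
import numpy as np
import matplotlib.pyplot as plt

def heavy_ball_trajectory(theta0, eta=0.1, mu=0.9, k=1.0, steps=200):
    """Iterate v <- mu*v - eta*grad, theta <- theta + v; record (theta, v)."""
    theta, v = theta0, 0.0
    traj = [(theta, v)]
    for _ in range(steps):
        grad = k * theta              # gradient of 0.5 * k * theta^2
        v = mu * v - eta * grad
        theta = theta + v
        traj.append((theta, v))
    return np.array(traj)

for mu in (0.0, 0.5, 0.9):            # mu = 0.0 reduces to plain SGD
    traj = heavy_ball_trajectory(theta0=2.0, mu=mu)
    plt.plot(traj[:, 0], traj[:, 1], label=f"mu={mu}")
plt.xlabel("theta (position)")
plt.ylabel("v (velocity)")
plt.legend()
plt.title("Phase trajectories of heavy-ball dynamics")
plt.show()
```

With mu = 0 the iteration reduces to plain gradient descent and the trajectory decays monotonically toward the origin; larger momentum coefficients produce the spiral (underdamped) trajectories that a phase-plane analysis of hyperparameter choices makes visible.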

Details

Language:
English
ISSN:
1674-733X
Volume:
66
Issue:
4
Database:
Complementary Index
Journal:
SCIENCE CHINA Information Sciences
Publication Type:
Academic Journal
Accession Number:
162862509
Full Text:
https://doi.org/10.1007/s11432-022-3539-8