1. Formal convergence analysis on deterministic ℓ1-regularization based mini-batch learning for RBF networks.
- Author
- Liu, Zhaofeng; Leung, Chi-Sing; and So, Hing Cheung
- Subjects
- *ITERATIVE learning control, *ARTIFICIAL neural networks, *RADIAL basis functions, *NONLINEAR regression, *SMOOTHNESS of functions, *DETERMINISTIC algorithms
- Abstract
Conventional convergence analysis of mini-batch learning is usually based on the stochastic gradient concept, which assumes that the training data are presented in a random order. Some convergence results also require that the learning rate decrease with the number of training cycles and that the objective function be smooth. In practice, a deterministic presentation scheme with a fixed learning rate is preferable. Hence, there is a gap between theoretical results and actual implementation. This paper aims to fill that gap. We use the radial basis function (RBF) model for nonlinear regression problems as an example to analyze the convergence properties of mini-batch learning. This paper considers a nonsmooth objective function that consists of three terms, whose coexistence allows it to handle a number of situations. The first term is a conventional training set error. The second term is a quadratic term used to suppress the effect of imperfections in the implementation. The last term is an ℓ1-norm term used to select important RBF nodes for the resultant network. Note that the ℓ1-norm term is a nonsmooth function. Although a nonsmooth ℓ1-norm is included and the mini-batch algorithm is deterministic, we are still able to derive the convergence properties, including the sufficient conditions for convergence and the range of the learning rate. With our results, we have a better theoretical understanding of the behaviour of mini-batch learning and obtain some guidelines on choosing the learning rate. The analysis results can be extended to other flat structural neural network models and to other objective functions that contain quadratic terms and an ℓ1-norm term. [ABSTRACT FROM AUTHOR]
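The setting described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract gives no update rule, so everything here is an assumption. That includes the Gaussian basis parameterization, a plain ridge penalty `alpha * w @ w` standing in for the quadratic term, a soft-thresholding (proximal) step for the ℓ1 term, and all names and parameter values. What it does take from the abstract is the three-term objective, the fixed presentation order, and the fixed learning rate.

```python
import numpy as np

def rbf_design(x, centers, width):
    # Gaussian RBF outputs, shape (n_samples, n_centers); this form is an assumption.
    d2 = (x[:, None] - centers[None, :]) ** 2
    return np.exp(-d2 / width)

def objective(w, Phi, y, alpha, lam):
    # Three terms: training-set error + quadratic term + l1-norm term.
    err = np.mean((y - Phi @ w) ** 2)
    return err + alpha * (w @ w) + lam * np.abs(w).sum()

def soft_threshold(v, t):
    # Proximal operator of t * ||v||_1 (assumed way to handle the nonsmooth term).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def minibatch_train(Phi, y, alpha, lam, lr, batch_size, n_cycles):
    n, m = Phi.shape
    w = np.zeros(m)
    for _ in range(n_cycles):
        # Deterministic presentation: batches are visited in the same fixed
        # order every cycle, with no shuffling and a fixed learning rate.
        for start in range(0, n, batch_size):
            P = Phi[start:start + batch_size]
            t = y[start:start + batch_size]
            # Gradient of the smooth part (batch error + quadratic term).
            grad = -2.0 * P.T @ (t - P @ w) / len(t) + 2.0 * alpha * w
            w = soft_threshold(w - lr * grad, lr * lam)
    return w

# Hypothetical toy regression problem.
x = np.linspace(-1.0, 1.0, 200)
y = np.sin(3.0 * x)
centers = np.linspace(-1.0, 1.0, 20)
Phi = rbf_design(x, centers, width=0.1)
w = minibatch_train(Phi, y, alpha=1e-3, lam=1e-2, lr=0.05,
                    batch_size=20, n_cycles=200)
print("objective at zero weights:", objective(np.zeros_like(w), Phi, y, 1e-3, 1e-2))
print("objective after training: ", objective(w, Phi, y, 1e-3, 1e-2))
print("nonzero weights:", int(np.count_nonzero(w)), "of", w.size)
```

Because nothing is shuffled, repeated runs give identical weights. The learning rate of 0.05 was chosen by hand for this toy problem and is not one of the ranges derived in the paper.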
- Published
- 2023