P03 - Additively Preconditioned Trust Region Strategies for Machine Learning
Description
In our work, we present a novel variant of the “Additively Preconditioned Trust-Region Strategy” (APTS) to train neural networks (NNs). APTS is based on a right-preconditioned Trust-Region (TR) method that employs an additive, domain-decomposition-based preconditioner. In the context of NN training, the decomposed domain is either the network's parameters or the training data set. By virtue of the TR framework, APTS guarantees global convergence to a minimizer. It also eliminates the need for costly hyper-parameter tuning, since the TR algorithm automatically determines the step size in every iteration. Our numerical study includes a comparison with widely used training methods such as SGD, Adam, L-BFGS, and the standard TR method, and demonstrates the strengths and limitations of the proposed training methods; a sketch of one APTS-style iteration is given below.
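The following minimal PyTorch sketch illustrates the parameter-decomposition variant of one such iteration: each parameter group (subdomain) computes a local, radius-limited correction with the other groups frozen, the corrections are combined additively into a global trial step, and a standard TR ratio test decides acceptance and the next radius. The function name `apts_step`, the crude fixed-step local solver, the group layout, and the ratio thresholds are illustrative assumptions, not the authors' implementation.

```python
import torch

def apts_step(model, loss_fn, data, target, groups, radius):
    """One APTS-style iteration (illustrative sketch): additive local
    corrections per parameter group, combined into a global trial step
    that is accepted or rejected by a TR ratio test."""
    params = list(model.parameters())
    x0 = [p.detach().clone() for p in params]

    def f():
        return loss_fn(model(data), target)

    # Loss and gradient at the current iterate x0.
    loss0 = f()
    g0 = torch.autograd.grad(loss0, params)

    # Local solves: a few plain gradient steps per subdomain (group),
    # with the other groups frozen and the correction clipped to `radius`.
    corrections = [torch.zeros_like(p) for p in params]
    for group in groups:                         # list of parameter indices
        with torch.no_grad():                    # restart from x0
            for p, p0 in zip(params, x0):
                p.copy_(p0)
        for _ in range(3):                       # crude local solver
            grads = torch.autograd.grad(f(), [params[i] for i in group])
            with torch.no_grad():
                for i, g in zip(group, grads):
                    params[i] -= 0.1 * g
        with torch.no_grad():                    # clip ||s_i|| <= radius
            s = [params[i] - x0[i] for i in group]
            norm = torch.sqrt(sum((si ** 2).sum() for si in s)).item()
            scale = min(1.0, radius / (norm + 1e-12))
            for i, si in zip(group, s):
                corrections[i] += scale * si

    with torch.no_grad():
        # Additive combination of the local corrections, kept inside
        # the global trust region.
        total = torch.sqrt(sum((c ** 2).sum() for c in corrections)).item()
        scale = min(1.0, radius / (total + 1e-12))
        for p, p0, c in zip(params, x0, corrections):
            p.copy_(p0 + scale * c)
        # Ratio test: actual vs. first-order predicted reduction.
        loss1 = f()
        pred = (-scale * sum((g * c).sum()
                             for g, c in zip(g0, corrections))).item()
        rho = (loss0.item() - loss1.item()) / max(pred, 1e-12)
        if rho < 0.25:                           # poor model fit: reject, shrink
            for p, p0 in zip(params, x0):
                p.copy_(p0)
            radius *= 0.5
        elif rho > 0.75:                         # good model fit: enlarge
            radius = min(2.0 * radius, 10.0)
    return radius

# Hypothetical usage: layer-wise decomposition of a tiny network.
model = torch.nn.Sequential(torch.nn.Linear(4, 8), torch.nn.Tanh(),
                            torch.nn.Linear(8, 1))
groups = [[0, 1], [2, 3]]  # (weight, bias) indices of each Linear layer
data, target = torch.randn(32, 4), torch.randn(32, 1)
radius = 1.0
for _ in range(20):
    radius = apts_step(model, torch.nn.functional.mse_loss,
                       data, target, groups, radius)
```

Note how the radius update replaces a hand-tuned learning rate: the ratio of actual to predicted reduction automatically shrinks the step when the local model is unreliable and enlarges it when the model fits well, which is the mechanism behind the claim that APTS avoids costly hyper-parameter tuning.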