Accelerating Neural Network Training with FSGQR: A Scalable and High-Performance Alternative to Adam
Published Online: Feb 05, 2025
Page range: 95 - 113
Received: Sep 07, 2024
Accepted: Dec 04, 2024
DOI: https://doi.org/10.2478/jaiscr-2025-0006
© 2025 Jarosław Bilski et al., published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
This paper introduces a significant advancement in neural network training algorithms through the development of the Fast Scaled Givens rotations in QR decomposition (FSGQR) method, based on the recursive least squares (RLS) method. The algorithm is an optimized variant of existing rotation-based training approaches, distinguished by the complete elimination of scale factors from the calculations while maintaining mathematical precision. Through extensive experimentation across multiple benchmarks, including complex tasks such as MNIST digit recognition and concrete strength prediction, FSGQR demonstrates superior performance compared to the widely used ADAM optimizer and other conventional training methods. The algorithm achieves faster convergence with fewer training epochs while maintaining or improving accuracy. In some tasks, FSGQR completed training in up to five times fewer epochs than the ADAM algorithm, while achieving higher recognition accuracy on the MNIST training set. The paper provides comprehensive mathematical foundations for the optimization, detailed implementation guidelines, and extensive empirical validation across various neural network architectures. The results conclusively demonstrate that FSGQR offers a compelling alternative to current deep learning optimization methods, particularly for applications requiring rapid training convergence without sacrificing accuracy. The algorithm's effectiveness is particularly noteworthy in feedforward neural networks with differentiable activation functions, making it a valuable tool for modern machine learning applications.
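To make the building block behind FSGQR concrete, the sketch below shows a standard QR decomposition computed with plain (unscaled) Givens rotations in Python using NumPy. It is a minimal, generic illustration of rotation-based triangularization, not the authors' FSGQR algorithm: the scaled-rotation bookkeeping and the scale-factor elimination described in the paper are omitted, and the function names (givens_rotation, qr_givens) are illustrative.

```python
import numpy as np

def givens_rotation(a, b):
    """Return (c, s) so that [[c, s], [-s, c]] @ [a, b]^T = [r, 0]^T."""
    if b == 0.0:
        return 1.0, 0.0
    r = np.hypot(a, b)
    return a / r, b / r

def qr_givens(A):
    """QR decomposition of A via a sequence of plain Givens rotations.

    Each rotation zeroes one subdiagonal entry of the working matrix R;
    the product of the rotations accumulates into the orthogonal factor Q.
    """
    m, n = A.shape
    Q = np.eye(m)
    R = A.astype(float).copy()
    for j in range(n):                      # columns, left to right
        for i in range(m - 1, j, -1):       # zero entries below the diagonal, bottom up
            c, s = givens_rotation(R[i - 1, j], R[i, j])
            G = np.eye(m)
            G[i - 1, i - 1] = c
            G[i, i] = c
            G[i - 1, i] = s
            G[i, i - 1] = -s
            R = G @ R                       # apply rotation to rows i-1 and i
            Q = Q @ G.T                     # accumulate the inverse rotation into Q
    return Q, R

if __name__ == "__main__":
    A = np.random.randn(5, 3)
    Q, R = qr_givens(A)
    assert np.allclose(Q @ R, A)            # Q R reconstructs A
    assert np.allclose(Q.T @ Q, np.eye(5))  # Q is orthogonal
```

In RLS-based training schemes of this kind, rotations like these update the triangular factor as new samples arrive; the scaled-rotation variant used by FSGQR avoids the explicit square roots and scale factors that make the plain version above comparatively expensive.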