Accelerating Neural Network Training with FSGQR: A Scalable and High-Performance Alternative to Adam
Published Online: Feb 05, 2025
Page range: 95 - 113
Received: Sep 07, 2024
Accepted: Dec 04, 2024
DOI: https://doi.org/10.2478/jaiscr-2025-0006
© 2025 Jarosław Bilski et al., published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
This paper introduces a significant advancement in neural network training algorithms through the development of the Fast Scaled Givens rotations in QR decomposition (FSGQR) method, based on the recursive least squares (RLS) method. The algorithm is an optimized variant of existing rotation-based training approaches, distinguished by the complete elimination of scale factors from the calculations while maintaining mathematical precision. Through extensive experimentation across multiple benchmarks, including complex tasks such as MNIST digit recognition and concrete strength prediction, FSGQR demonstrates superior performance compared to the widely used ADAM optimizer and other conventional training methods. The algorithm achieves faster convergence with fewer training epochs while maintaining or improving accuracy. In some tasks, FSGQR completed training in up to five times fewer epochs than the ADAM algorithm, while achieving higher recognition accuracy on the MNIST training set. The paper provides comprehensive mathematical foundations for the optimization, detailed implementation guidelines, and extensive empirical validation across various neural network architectures. The results conclusively demonstrate that FSGQR offers a compelling alternative to current deep learning optimization methods, particularly for applications requiring rapid training convergence without sacrificing accuracy. The algorithm's effectiveness is particularly noteworthy in feedforward neural networks with differentiable activation functions, making it a valuable tool for modern machine learning applications.
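To make the building block behind FSGQR concrete, the sketch below shows a standard QR decomposition computed with plain (unscaled) Givens rotations in Python using NumPy. It is a minimal, generic illustration of rotation-based triangularization, not the authors' FSGQR algorithm: the scaled-rotation bookkeeping and the scale-factor elimination described in the paper are omitted, and the function names (givens_rotation, qr_givens) are illustrative.

```python
import numpy as np

def givens_rotation(a, b):
    """Return (c, s) so that [[c, s], [-s, c]] @ [a, b]^T = [r, 0]^T."""
    if b == 0.0:
        return 1.0, 0.0
    r = np.hypot(a, b)
    return a / r, b / r

def qr_givens(A):
    """QR decomposition of A via a sequence of plain Givens rotations.

    Each rotation zeroes one subdiagonal entry of the working matrix R;
    the product of the rotations accumulates into the orthogonal factor Q.
    """
    m, n = A.shape
    Q = np.eye(m)
    R = A.astype(float).copy()
    for j in range(n):                      # columns, left to right
        for i in range(m - 1, j, -1):       # zero entries below the diagonal, bottom up
            c, s = givens_rotation(R[i - 1, j], R[i, j])
            G = np.eye(m)
            G[i - 1, i - 1] = c
            G[i, i] = c
            G[i - 1, i] = s
            G[i, i - 1] = -s
            R = G @ R                       # apply rotation to rows i-1 and i
            Q = Q @ G.T                     # accumulate the inverse rotation into Q
    return Q, R

if __name__ == "__main__":
    A = np.random.randn(5, 3)
    Q, R = qr_givens(A)
    assert np.allclose(Q @ R, A)            # Q R reconstructs A
    assert np.allclose(Q.T @ Q, np.eye(5))  # Q is orthogonal
```

In RLS-based training schemes of this kind, rotations like these update the triangular factor as new samples arrive; the scaled-rotation variant used by FSGQR avoids the explicit square roots and scale factors that make the plain version above comparatively expensive.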