Curvature-Adaptive Learning Rate Optimizer: Theoretical Insights and Empirical Evaluation on Neural Network Training

Authors

  • Kehelwala Dewage Gayan Maduranga, Tennessee Technological University

DOI:

https://doi.org/10.32473/flairs.38.1.138986

Abstract

Neural network optimization often encounters challenges such as saddle points, plateaus, and ill-conditioned curvature, which limit the effectiveness of standard optimizers like Adam, Nadam, and RMSProp. To address these limitations, we propose the Curvature-Adaptive Learning Rate (CALR) optimizer, a novel method that leverages local curvature estimates to dynamically adjust learning rates. CALR, along with its variants incorporating gradient clipping and cosine annealing schedules, offers enhanced robustness and faster convergence across diverse optimization tasks. Theoretical analysis confirms CALR's convergence properties, while empirical evaluations on benchmark functions (Rosenbrock, Himmelblau, and Saddle Point) highlight its efficiency in complex optimization landscapes. Furthermore, CALR demonstrates superior performance on neural network training tasks using the MNIST and CIFAR-10 datasets, achieving faster convergence, lower loss, and better generalization than traditional optimizers. These results establish CALR as a promising optimization strategy for challenging neural network training problems.
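The abstract describes CALR's mechanism (a learning rate scaled by a local curvature estimate, with gradient-clipping and cosine-annealing variants) but not the exact update rule. The sketch below is only an illustration of that general idea, assuming curvature is approximated from the change between successive gradients; the function name calr_step, the curvature proxy, and the 1/(1 + curvature) scaling are hypothetical placeholders, not the published method.

```python
import numpy as np

def calr_step(params, grad_fn, prev_grad, base_lr, step, total_steps,
              clip_norm=1.0, eps=1e-8):
    """One illustrative curvature-adaptive update (not the paper's rule)."""
    # Current gradient at the given parameters.
    grad = grad_fn(params)

    # Gradient-clipping variant: rescale when the norm exceeds clip_norm.
    norm = np.linalg.norm(grad)
    if norm > clip_norm:
        grad = grad * (clip_norm / norm)

    # Crude local-curvature proxy: relative change between successive gradients.
    curvature = np.linalg.norm(grad - prev_grad) / (np.linalg.norm(prev_grad) + eps)

    # Cosine-annealing variant: decay the base rate over the run.
    anneal = 0.5 * (1.0 + np.cos(np.pi * step / total_steps))

    # Higher estimated curvature leads to a smaller effective step.
    lr = base_lr * anneal / (1.0 + curvature)

    return params - lr * grad, grad

# Example: minimizing the Rosenbrock function, one of the benchmarks
# mentioned in the abstract.
rosenbrock_grad = lambda p: np.array([
    -2 * (1 - p[0]) - 400 * p[0] * (p[1] - p[0] ** 2),
    200 * (p[1] - p[0] ** 2),
])
params, prev_grad = np.array([-1.5, 2.0]), np.zeros(2)
for t in range(1000):
    params, prev_grad = calr_step(params, rosenbrock_grad, prev_grad,
                                  base_lr=1e-3, step=t, total_steps=1000)
```

The intent of such a scheme is that steps shrink where the gradient is changing rapidly (sharp or ill-conditioned regions) and stay large on plateaus, which is the behavior the abstract attributes to CALR.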

Published

14-05-2025

How to Cite

Maduranga, K. D. G. (2025). Curvature-Adaptive Learning Rate Optimizer: Theoretical Insights and Empirical Evaluation on Neural Network Training. The International FLAIRS Conference Proceedings, 38(1). https://doi.org/10.32473/flairs.38.1.138986

Section

Special Track: Neural Networks and Data Mining