Author(s):
1. Falah Amer Abdulazeez:
University of Anbar, Ramadi city - Al Anbar Governorate, Iraq
2. Abdul Sttar Ismail:
University of Anbar, Ramadi city - Al Anbar Governorate, Iraq
3. Rafid S. Abdulaziz:
University of Anbar, Ramadi city - Al Anbar Governorate, Iraq
Abstract:
Deep neural networks (DNNs) are widely used, but their large number of parameters makes training expensive. Sophisticated optimisers with multiple hyperparameters can speed up training and improve generalisation, yet tuning those hyperparameters is largely a matter of trial and error. In this study, we analyse the distinct contributions that individual training samples make to a parameter update. Building on this analysis, we propose adaptive stochastic gradient descent (aSGD), a variant of batch stochastic gradient descent for neural networks with ReLU activations in the hidden layers. Unlike earlier methods, aSGD uses the mean effective gradient as the actual gradient for parameter updates. Experiments on MNIST show that aSGD accelerates DNN optimisation and improves accuracy without introducing additional hyperparameters. Experiments on synthetic datasets show that it can also locate redundant nodes, which aids model compression.
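The abstract's core idea, as we read it, is that in a ReLU network some samples in a batch contribute a zero gradient to a given parameter (the unit is inactive for them), and plain batch SGD dilutes the update by averaging over the whole batch anyway. The minimal NumPy sketch below illustrates that reading: it averages each hidden weight's gradient only over the samples whose ReLU unit was active. The network shape, all names, and the exact masking rule are our illustrative assumptions, not the authors' published code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny one-hidden-layer ReLU network: x -> ReLU(x @ W1) @ w2
# (illustrative shapes only; not the paper's experimental setup)
n, d, h = 32, 10, 5                  # batch size, input dim, hidden width
X = rng.normal(size=(n, d))
y = rng.normal(size=n)
W1 = rng.normal(size=(d, h)) * 0.1
w2 = rng.normal(size=h) * 0.1

def asgd_step(X, y, W1, w2, lr=0.01):
    z = X @ W1                       # pre-activations, shape (n, h)
    a = np.maximum(z, 0.0)           # ReLU activations
    err = a @ w2 - y                 # dL/dpred for a squared-error loss

    # Per-sample gradients for W1, shape (n, d, h). The ReLU gates samples:
    # rows with z <= 0 contribute exactly zero to that unit's gradient.
    active = (z > 0).astype(float)   # (n, h)
    g_per_sample = X[:, :, None] * (
        err[:, None, None] * w2[None, None, :] * active[:, None, :]
    )

    # Plain batch SGD would use g_per_sample.mean(axis=0).
    # aSGD (as we read the abstract) divides instead by the number of
    # *effective* samples per hidden unit, so inactive samples do not
    # dilute the mean effective gradient.
    n_eff = active.sum(axis=0)       # effective sample count per unit, (h,)
    g_W1 = g_per_sample.sum(axis=0) / np.maximum(n_eff, 1.0)[None, :]

    # The output weights are not gated by a ReLU, so every sample is
    # effective for them and the ordinary batch mean applies.
    g_w2 = (a * err[:, None]).mean(axis=0)

    return W1 - lr * g_W1, w2 - lr * g_w2

W1, w2 = asgd_step(X, y, W1, w2)
```

Note that the per-unit effective count `n_eff` is computed from the batch itself, which is consistent with the abstract's claim that aSGD adds no new hyperparameters.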
Page(s):
24-36
DOI:
DOI not available
Published:
Journal: International Journal of Communication Networks and Information Security, Volume: 15, Issue: 1, Year: 2023
Keywords:
Gradient Descent, Optimisation Algorithm, Deep Network Optimisation, Adaptive Gradient Descent, Batch Size