TY - GEN
T1 - Enhancing Deep Neural Network Convergence and Performance
T2 - 2nd International Conference on Informatics, ICI 2023
AU - Maurya, Ritesh
AU - Aggarwal, Divyam
AU - Gopalakrishnan, T.
AU - Pandey, Nageshwar Nath
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Activation functions play an important role in Deep Neural Networks. An activation function can learn the nonlinearities present in the data and can therefore capture intricate patterns in the data. The Rectified Linear Unit (ReLU) is an activation function that helps counter the vanishing gradient problem. However, it suffers from the 'dying ReLU' problem for negative values. Leaky ReLU can solve the 'dying ReLU' problem, though it still suffers from a vanishing gradient problem due to the small gradient for negative values, which results in slow convergence. Therefore, in this work, a combination of ReLU and the Exponential Linear Unit (ELU) has been proposed, considering the smoother convergence of the ELU activation function for values on the negative side. The effectiveness of the developed hybrid activation function is evaluated against previous ReLU variants such as SELU, Leaky ReLU, and ELU using a toy multi-layer perceptron and a convolutional neural network (CNN) model on the FashionMNIST and MNIST datasets. The improvement in the performance of these toy models when used with the proposed hybrid activation function on the given datasets suggests the effectiveness of the proposed hybrid activation function.
AB - Activation functions play an important role in Deep Neural Networks. An activation function can learn the nonlinearities present in the data and can therefore capture intricate patterns in the data. The Rectified Linear Unit (ReLU) is an activation function that helps counter the vanishing gradient problem. However, it suffers from the 'dying ReLU' problem for negative values. Leaky ReLU can solve the 'dying ReLU' problem, though it still suffers from a vanishing gradient problem due to the small gradient for negative values, which results in slow convergence. Therefore, in this work, a combination of ReLU and the Exponential Linear Unit (ELU) has been proposed, considering the smoother convergence of the ELU activation function for values on the negative side. The effectiveness of the developed hybrid activation function is evaluated against previous ReLU variants such as SELU, Leaky ReLU, and ELU using a toy multi-layer perceptron and a convolutional neural network (CNN) model on the FashionMNIST and MNIST datasets. The improvement in the performance of these toy models when used with the proposed hybrid activation function on the given datasets suggests the effectiveness of the proposed hybrid activation function.
UR - https://www.scopus.com/pages/publications/85186120236
UR - https://www.scopus.com/pages/publications/85186120236#tab=citedBy
U2 - 10.1109/ICI60088.2023.10421353
DO - 10.1109/ICI60088.2023.10421353
M3 - Conference contribution
AN - SCOPUS:85186120236
T3 - Proceedings of 2023 2nd International Conference on Informatics, ICI 2023
BT - Proceedings of 2023 2nd International Conference on Informatics, ICI 2023
A2 - Singh, Sandeep Kumar
A2 - Saxena, Vikas
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 23 November 2023 through 25 November 2023
ER -