Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/510699
Title: A study on smooth activation functions
Researcher: Biswas, Koushik
Guide(s): Pandey, Ashish Kumar and Banerjee, Shilpak
Keywords: Computer Science; Computer Science Artificial Intelligence; Engineering and Technology
University: Indraprastha Institute of Information Technology, Delhi (IIIT-Delhi)
Completed Date: 2023
Abstract: Artificial neural networks (ANNs) occupy centre stage in deep learning. An activation function is a crucial component of a neural network: it introduces non-linearity into the network. An activation function is considered good if it generalises well across a variety of datasets, ensures faster convergence, and improves neural network performance. The Rectified Linear Unit (ReLU) has emerged as the most popular activation function due to its simplicity, though it has some drawbacks. To overcome the shortcomings of ReLU (non-smoothness, non-zero mean, and missing negative values, to name a few), and to increase accuracy considerably across a variety of tasks, many new activation functions have been proposed over the years, such as Leaky ReLU, ELU, Softplus, Parametric ReLU, and ReLU6. However, all of them provide only marginal improvement over ReLU. Swish, GELU, the Padé activation unit (PAU), and Mish are recently proposed non-linear smooth activations that show good improvement over ReLU in a variety of deep learning tasks. ReLU and its variants are non-smooth (continuous but not differentiable) at the origin, although smoothness is an important property during backpropagation. We construct several smooth activation functions that approximate ReLU, Leaky ReLU, or their variants by a smooth function. Some of these functions are hand-engineered, while others arise from underlying mathematical theory. All of these functions have shown good improvement over ReLU or Swish on a variety of standard datasets in different deep learning problems, such as image classification, object detection, semantic segmentation, and machine translation.
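The abstract contrasts ReLU, which is non-differentiable at the origin, with smooth activations such as Softplus, Swish, and Mish. The thesis's own constructed functions are not reproduced here; as an illustrative sketch using only standard definitions from the literature, the snippet below implements these well-known activations in NumPy and shows that Softplus approaches ReLU as its sharpness parameter `beta` grows.

```python
import numpy as np

def relu(x):
    # ReLU: max(0, x); continuous but not differentiable at the origin
    return np.maximum(0.0, x)

def softplus(x, beta=1.0):
    # Softplus: log(1 + exp(beta * x)) / beta, a smooth approximation
    # of ReLU; converges pointwise to ReLU as beta -> infinity
    return np.log1p(np.exp(beta * x)) / beta

def swish(x, beta=1.0):
    # Swish: x * sigmoid(beta * x), a smooth non-monotonic activation
    return x / (1.0 + np.exp(-beta * x))

def mish(x):
    # Mish: x * tanh(softplus(x))
    return x * np.tanh(np.log1p(np.exp(x)))

# With a large beta, Softplus is numerically close to ReLU
x = np.array([-2.0, 0.0, 2.0])
print(softplus(x, beta=10.0))  # close to relu(x) = [0, 0, 2]
```

The `beta` parameter here is the standard Softplus sharpness knob (as in PyTorch's `nn.Softplus`), not a parameter specific to this thesis; all four functions are smooth everywhere except ReLU, which is the point of comparison.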
Pagination: 178 p.
URI: http://hdl.handle.net/10603/510699
Appears in Departments: Department of Computer Science and Engineering
Files in This Item:
File | Size | Format
---|---|---
01_title.pdf | 53.08 kB | Adobe PDF
02_prelim pages.pdf | 379.18 kB | Adobe PDF
03_content.pdf | 52.75 kB | Adobe PDF
04_abstract.pdf | 45.02 kB | Adobe PDF
05_chapter 1.pdf | 317.25 kB | Adobe PDF
06_chapter 2.pdf | 62.55 kB | Adobe PDF
07_chapter 3.pdf | 3.43 MB | Adobe PDF
08_chapter 4.pdf | 6.32 MB | Adobe PDF
09_chapter 5.pdf | 9.94 MB | Adobe PDF
10_annexures.pdf | 216.73 kB | Adobe PDF
11_chapter 6.pdf | 5.83 MB | Adobe PDF
12_chapter 7.pdf | 4.82 MB | Adobe PDF
13_chapter 8.pdf | 2.66 MB | Adobe PDF
14_chapter 9.pdf | 897.26 kB | Adobe PDF
80_recommendation.pdf | 159.32 kB | Adobe PDF
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).