Definition

The activation function of a node in an artificial neural network is a function that calculates the node's output from the linear combination of its inputs. It is used to introduce non-linearity into the model.
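
A minimal sketch of this in NumPy; the weights, bias, and the choice of ReLU as the activation are made-up values for illustration:

```python
import numpy as np

def node_output(inputs, weights, bias, activation):
    # Linear combination of the inputs, then a non-linear activation.
    z = np.dot(weights, inputs) + bias
    return activation(z)

# Illustrative values; ReLU is used here only as an example activation.
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.4, 0.1, 0.3])
print(node_output(x, w, bias=0.2, activation=lambda z: np.maximum(0.0, z)))
```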

Examples

Logistic Function

Definition

The logistic function is the inverse function of the Logit:

$$\sigma(x) = \frac{1}{1 + e^{-x}}$$

Facts

The sigmoid activation function is vulnerable to the vanishing gradient problem. The image of the derivative of the sigmoid function is $(0, \frac{1}{4}]$. For this reason, every time the gradient passes through a node with the sigmoid Activation Function, it is multiplied by a factor of at most $\frac{1}{4}$ and therefore shrinks.

Also, because the output of the sigmoid Activation Function is always positive, if all the inputs to a node are positive, then the gradients with respect to its weights are either all positive or all negative, so they are all updated in the same direction.
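
A small NumPy check of the derivative bound and its effect over several layers; the input range and the layer count of 10 are illustrative assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)  # maximum value 1/4, attained at x = 0

x = np.linspace(-10, 10, 1001)
print(sigmoid_grad(x).max())  # ~0.25: the derivative never exceeds 1/4
# Backpropagating through 10 sigmoid nodes multiplies the gradient by <= 1/4 each time.
print(0.25 ** 10)             # upper bound on the surviving gradient factor
```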

Hyperbolic Tangent Function

Definition

$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$

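A quick numerical check of this definition against NumPy's built-in `np.tanh`; the sample points are arbitrary:

```python
import numpy as np

def tanh_manual(x):
    return (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(np.allclose(tanh_manual(x), np.tanh(x)))  # True
```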

Rectified Linear Unit Function

Definition

$$\mathrm{ReLU}(x) = \max(0, x)$$

Facts

If a node's pre-activation is negative, ReLU's output and gradient are both zero, so its weights receive no update; with an unlucky initial value the node can therefore stay inactive forever (the dying ReLU problem).
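
A minimal NumPy sketch of this effect; the negative pre-activation value is an illustrative assumption:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def relu_grad(z):
    return np.where(z > 0, 1.0, 0.0)  # zero wherever the pre-activation is negative

z = -3.0                              # negative pre-activation
upstream_grad = 1.0
print(relu(z))                        # 0.0 -> the node outputs nothing
print(upstream_grad * relu_grad(z))   # 0.0 -> weight gradients are zero, so the weights never change
```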

ReLU6

Definition

$$\mathrm{ReLU6}(x) = \min(\max(0, x), 6)$$

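A one-line NumPy version of the clipped definition above; the sample inputs are arbitrary:

```python
import numpy as np

def relu6(x):
    return np.minimum(np.maximum(0.0, x), 6.0)

print(relu6(np.array([-2.0, 3.0, 10.0])))  # [0. 3. 6.]
```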

Gaussian-Error Linear Unit

Definition

GELU is a smooth approximation of ReLU:

$$\mathrm{GELU}(x) = x\,\Phi(x)$$

where $\Phi$ is the CDF of the standard normal distribution.
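
A small sketch computing $\Phi$ exactly via `math.erf` and comparing GELU with ReLU at a few arbitrary points, to show how GELU tracks ReLU while staying smooth near zero:

```python
import math

def gelu(x):
    phi = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))  # standard normal CDF
    return x * phi

for x in (-3.0, -0.5, 0.0, 0.5, 3.0):
    print(x, round(gelu(x), 4), max(0.0, x))  # GELU vs. ReLU
```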

Parametric ReLU

Definition

$$\mathrm{PReLU}(x) = \begin{cases} x & \text{if } x > 0 \\ \alpha x & \text{otherwise} \end{cases}$$

where $\alpha$ is a hyperparameter

Facts

If $\alpha$ is fixed to a small constant such as $0.01$, it is called a Leaky ReLU
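
A minimal NumPy sketch; fixing `alpha` to $0.01$ gives the Leaky ReLU case mentioned above:

```python
import numpy as np

def prelu(x, alpha):
    return np.where(x > 0, x, alpha * x)

x = np.array([-4.0, -1.0, 0.0, 2.0])
print(prelu(x, alpha=0.01))  # Leaky ReLU: [-0.04 -0.01  0.    2.  ]
```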

Exponential Linear Unit

Definition

$$\mathrm{ELU}(x) = \begin{cases} x & \text{if } x > 0 \\ \alpha (e^{x} - 1) & \text{otherwise} \end{cases}$$

where $\alpha$ is a hyperparameter
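
A minimal NumPy sketch of the piecewise definition; `alpha = 1.0` is an illustrative choice:

```python
import numpy as np

def elu(x, alpha=1.0):
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

print(elu(np.array([-3.0, -1.0, 0.0, 2.0])))  # negative inputs saturate toward -alpha
```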

Swish Function

Definition

$$\mathrm{Swish}(x) = x\,\sigma(\beta x)$$

where $\sigma$ is the Sigmoid Function, and $\beta$ is a hyperparameter

When $\beta = 1$, the function is called the sigmoid linear unit (SiLU).
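
A minimal NumPy sketch; with `beta = 1` it reduces to SiLU as noted above:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def swish(x, beta=1.0):
    return x * sigmoid(beta * x)

x = np.array([-2.0, 0.0, 2.0])
print(swish(x, beta=1.0))  # SiLU: x * sigmoid(x)
```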
