Sigmoid

The sigmoid, defined as $f(x) = \frac{1}{1 + e^{-x}}$, is a non-linear function that suffers from saturation.

Saturation of activation

An activation function whose gradient is almost zero in certain regions. This is an undesirable property, since it results in slow learning.
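A minimal sketch in plain Python of the sigmoid and its derivative, illustrating saturation: the gradient peaks at 0.25 when $x = 0$ and becomes vanishingly small for large $|x|$. The function names are illustrative.

```python
import math

def sigmoid(x):
    """Logistic sigmoid: squashes any real input into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    """Derivative f'(x) = f(x) * (1 - f(x)); at most 0.25, reached at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)

# In the saturated regions the gradient is nearly zero:
# sigmoid_grad(10.0) is on the order of 4.5e-5, so a unit
# operating there contributes almost nothing to learning.
```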

Tanh

This non-linearity squashes a real-valued number to the range $[-1, 1]$. Like the sigmoid neuron, its activations saturate, but unlike the sigmoid neuron its output is zero-centered.
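One way to see the relationship between the two saturating non-linearities: tanh is a scaled and shifted sigmoid, $\tanh(x) = 2\sigma(2x) - 1$, which is why it inherits saturation but gains a zero-centered output. A small sketch (function names illustrative):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    # tanh is a rescaled sigmoid: 2 * sigmoid(2x) - 1.
    # The output is zero-centered: tanh(0) = 0 and tanh(-x) = -tanh(x).
    return 2.0 * sigmoid(2.0 * x) - 1.0
```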

ReLU

The most popular non-linearity in modern deep learning, partly due to its non-saturating nature, defined as $f(x) = \max(x, 0)$.
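ReLU and its gradient can be sketched in a couple of lines; the gradient is exactly 1 everywhere on the positive side, so it never saturates there (function names illustrative):

```python
def relu(x):
    """f(x) = max(x, 0): identity for positive inputs, zero otherwise."""
    return max(x, 0.0)

def relu_grad(x):
    """Gradient is 1 for x > 0 and 0 for x < 0: no saturation on the positive side,
    but zero gradient everywhere on the negative side."""
    return 1.0 if x > 0 else 0.0
```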

ReLU units can "die": if a unit only ever receives negative inputs, its gradient is zero and its weights stop updating. A possible fix to this dead filter problem is to define ReLU with a small slope in the negative part (the leaky ReLU), i.e., $f(x) = \begin{cases} ax, & x < 0 \\ x, & x \geq 0 \end{cases}$.
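The piecewise definition above translates directly to code. A minimal sketch of the leaky variant, with a small slope `a` (0.01 is a common choice, used here only as an illustrative default):

```python
def leaky_relu(x, a=0.01):
    """Leaky ReLU: identity for x >= 0, a small slope a*x for x < 0.
    The nonzero negative slope keeps a small gradient flowing, so
    units cannot die the way plain ReLU units can."""
    return x if x >= 0 else a * x

def leaky_relu_grad(x, a=0.01):
    """Gradient is 1 for x >= 0 and a for x < 0: never exactly zero."""
    return 1.0 if x >= 0 else a
```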