Sigmoid Activation Function
Activation functions are the functions that decide whether a particular neuron should produce an output or not.
As we can see, the input to the activation function comes from the summation function.
The output of the summation function is the dot product of the xi's and the wi's:
Output of summation function = w1*x1 + w2*x2 + w3*x3 + w4*x4
This output is passed to the activation function, say f(x). The output of the whole neuron is f(output of the summation function).
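The two steps above can be sketched in a few lines of Python. The weights, inputs, and the placeholder activation below are made-up values for illustration only:

```python
# Minimal sketch of a single neuron: summation followed by an activation.
# The weights and inputs are made-up example values.
w = [0.5, -1.0, 0.25, 2.0]   # w1..w4
x = [1.0, 2.0, 4.0, 0.5]     # x1..x4

# Output of the summation function: w1*x1 + w2*x2 + w3*x3 + w4*x4
s = sum(wi * xi for wi, xi in zip(w, x))

def f(z):
    # Placeholder activation; a real network would use sigmoid, tanh, etc.
    return z

neuron_output = f(s)
print(s, neuron_output)
```

With these example numbers the summation gives 0.5, and the neuron's output is f(0.5).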
The basic requirements for an activation function are that it should be differentiable and that its derivative should not be too expensive to compute.
Someone may wonder: why should the activation function be differentiable?
Ans: Because we train a multilayer perceptron using the backpropagation method. In backpropagation, we differentiate the activation function (via the chain rule) to compute the gradients used to update the weights toward their optimal values.
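To make the role of the derivative concrete, here is a hedged sketch of a single gradient-descent weight update for a one-input sigmoid neuron. The loss (squared error), learning rate, and data values are all made up for illustration:

```python
import math

def sigmoid(z):
    # f(z) = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + math.exp(-z))

# Made-up example: one input, one weight, squared-error loss.
x, target = 1.0, 0.0
w = 0.8
lr = 0.1

z = w * x          # summation function
y = sigmoid(z)     # neuron output

# Chain rule: dL/dw = dL/dy * dy/dz * dz/dw.
# The middle factor dy/dz = y * (1 - y) is the sigmoid's derivative --
# this is exactly where differentiability of the activation is needed.
dL_dy = 2 * (y - target)
dy_dz = y * (1 - y)
dz_dw = x
grad = dL_dy * dy_dz * dz_dw

w_new = w - lr * grad
print(grad, w_new)
```

If the activation were not differentiable, the factor dy/dz would be undefined and this update could not be computed.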
In the 1980s and 1990s, computer scientists came up with two simple activation functions:
- sigmoid activation function
- tanh activation function
Sigmoid activation function:
In the above graph, the sigmoid is continuous over the real numbers, and we can see that there are no sharp corners anywhere, so it is differentiable over the entire real line.
You may think that the sigmoid function looks difficult to differentiate, but it is actually not that hard.
Let f(x) = 1 / (1 + e^(-x)) be the sigmoid function. Its derivative turns out to have the simple form f'(x) = f(x) * (1 - f(x)).
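A minimal sketch of the sigmoid and its derivative in Python (the function names are my own), with a numerical check that the closed-form derivative matches a finite-difference estimate:

```python
import math

def sigmoid(x):
    # f(x) = 1 / (1 + e^(-x))
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_derivative(x):
    # Uses the identity f'(x) = f(x) * (1 - f(x))
    fx = sigmoid(x)
    return fx * (1.0 - fx)

# Check the identity against a central-difference numerical derivative at x = 0.5
h = 1e-6
numeric = (sigmoid(0.5 + h) - sigmoid(0.5 - h)) / (2 * h)
print(sigmoid(0.5))             # ≈ 0.6225
print(sigmoid_derivative(0.5))  # ≈ 0.2350
print(abs(sigmoid_derivative(0.5) - numeric) < 1e-6)
```

Note how the derivative reuses the forward value f(x), which is one reason the sigmoid was convenient for early backpropagation implementations.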
Once we have written the sigmoid function, we can use it during the forward propagation of the inputs, and use its derivative during backpropagation when updating the weights.
The other activation function, tanh, will be discussed later.