What is the Softmax function



Machine learning

Release date: 2024/9/7

 ・In Japanese
Prerequisites
 ・Differentiation of Rational Functions
 ・Neural Network


■What is the softmax function

The softmax function is one of the activation functions commonly used in neural networks. It normalizes its inputs so that the outputs sum to 1:

  y_i = exp(x_i) / (exp(x_1) + exp(x_2) + ... + exp(x_n))    (i = 1, 2, ..., n)


As a concrete example, the calculation result of y when x is given is shown below.
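The original numeric example is not reproduced here, so the input values below are my own; a minimal NumPy sketch of the calculation:

```python
import numpy as np

def softmax(x):
    # Subtract the max before exponentiating for numerical stability;
    # this does not change the result.
    e = np.exp(x - np.max(x))
    return e / e.sum()

x = np.array([1.0, 2.0, 3.0])  # example inputs (chosen for illustration)
y = softmax(x)
print(y)        # each entry lies in (0, 1)
print(y.sum())  # the entries sum to 1
```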



As you can see, the calculated values sum to 1; they have been normalized. Each value can therefore be interpreted as the probability of that output occurring. In a neural network, the softmax function is applied in the output layer so that the network's outputs can be read as the probability of each class.
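As a sketch of that output-layer usage (the weights, bias, and input below are made up for illustration, not taken from the original diagram):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

# Hypothetical 2-input, 3-class output layer: logits = W @ inp + b
W = np.array([[ 0.2, -0.5],
              [ 1.0,  0.3],
              [-0.7,  0.8]])
b = np.array([0.1, 0.0, -0.1])
inp = np.array([0.5, 1.5])

logits = W @ inp + b     # raw scores, may be negative
probs = softmax(logits)  # class probabilities
print(probs, probs.sum())
```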



■Difference between ratio calculation and softmax function

If the goal is only normalization, the simple ratio below should suffice, so what is the advantage of using the softmax function?

  y_i = x_i / (x_1 + x_2 + ... + x_n)


First, the softmax function accepts negative input values, which the ratio above cannot handle: a negative x_i produces a negative "probability", and the sum in the denominator can even be zero. The ability to process negative values makes softmax compatible with neural network outputs. Second, because the exponential function grows rapidly with its input, larger inputs are amplified relative to smaller ones, which makes small and large input values easier to distinguish.
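Both points can be illustrated with a short comparison (the input values are my own):

```python
import numpy as np

def ratio_norm(x):
    # Naive normalization: divide each entry by the plain sum.
    return x / x.sum()

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

x = np.array([-1.0, 2.0])  # contains a negative input
r = ratio_norm(x)          # sums to 1 but has a negative "probability"
s = softmax(x)             # all entries in (0, 1)
print(r, s)

# Amplification: the exponential widens the gap between large and small inputs.
a = np.array([1.0, 2.0])
b = np.array([1.0, 5.0])
print(softmax(a))  # moderately different entries
print(softmax(b))  # strongly dominated by the larger input
```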

 

■Differentiation of the softmax function

Below we use the derivative of a rational function (the quotient rule). Writing S = exp(x_1) + exp(x_2) + ... + exp(x_n), so that y_i = exp(x_i) / S:

  ∂y_i/∂x_i = (exp(x_i) * S - exp(x_i) * exp(x_i)) / S^2 = y_i (1 - y_i)

  ∂y_i/∂x_j = (0 * S - exp(x_i) * exp(x_j)) / S^2 = -y_i y_j    (i ≠ j)


Let's consider the following specific example.



The next example is as follows:
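Independently of the examples above, the derivative formulas can be checked numerically; a sketch (test point chosen by me) comparing the analytic Jacobian dy_i/dx_j = y_i (delta_ij - y_j) against central finite differences:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def softmax_jacobian(x):
    # Analytic Jacobian from the quotient rule:
    # dy_i/dx_j = y_i * (delta_ij - y_j)
    y = softmax(x)
    return np.diag(y) - np.outer(y, y)

x = np.array([0.5, 1.0, -0.5])  # arbitrary test point
J = softmax_jacobian(x)

# Central finite differences as an independent check.
h = 1e-6
num = np.empty((3, 3))
for j in range(3):
    d = np.zeros(3)
    d[j] = h
    num[:, j] = (softmax(x + d) - softmax(x - d)) / (2 * h)

print(np.max(np.abs(J - num)))  # very small discrepancy
```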










