For example, suppose we have a function of the following form.
To find the minimum of this function, we would normally look for the point where its derivative is 0.
However, the formula may be so complex that, although it can be differentiated, we cannot solve f'(x)=0 for x; in other cases we may not even know the form of the function.
One method for searching for the minimum in such cases is the gradient method.
It is widely used in neural networks and other fields.
The idea behind the gradient method is to first calculate the derivative of the function.
The derivative is the slope of the tangent to the function: if the slope is positive, the nearby minimum lies to the left, and if the slope is negative, it lies to the right.
Therefore, we move the measurement point in that direction and repeat this several times; the point where the slope of the tangent becomes sufficiently small is taken as the minimum.
As mentioned above, the gradient method is used when the equation f'(x)=0 cannot be solved, and yet it still relies on differentiation.
This is because differentiating a function and solving f'(x)=0 are separate tasks: differentiation itself is often possible even when the equation cannot be solved.
Even if the function cannot be differentiated analytically, the derivative can be approximated by evaluating the function at two nearby values of x and dividing the difference of the results by the distance between the two values.
What has been explained so far can be expressed as a general formula as follows:
x_new = x - α f'(x)   ... (1)
α is the reflection coefficient (commonly called the learning rate or step size), which determines the amount by which x is moved.
The slope of the tangent tells us which side the minimum is on, but not how far away it is, so we have no choice but to move the measurement point several times.
To keep the number of measurements as small as possible, the reflection coefficient can be adjusted according to the slope of the tangent.
(The larger the slope of the tangent, the larger the reflection coefficient.)
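The update rule described above can be sketched in a few lines of Python. This is only a minimal illustration, not a definitive implementation: the names `gradient_step` and `gradient_descent`, the stopping tolerance `tol`, and the step limit `max_steps` are assumptions introduced here, and the reflection coefficient is kept fixed rather than adjusted by the slope.

```python
def gradient_step(x, grad, alpha):
    """Apply the update once: move x against the slope of the tangent."""
    return x - alpha * grad(x)


def gradient_descent(grad, x0, alpha=0.1, tol=1e-6, max_steps=10_000):
    """Repeat the update until the slope is sufficiently small.

    grad      : function returning the derivative f'(x)
    x0        : starting point
    alpha     : reflection coefficient (step size), fixed here
    tol       : hypothetical threshold for "sufficiently small" slope
    max_steps : hypothetical safety limit on the number of moves
    """
    x = x0
    for _ in range(max_steps):
        if abs(grad(x)) < tol:
            break  # the tangent is nearly flat, so stop here
        x = gradient_step(x, grad, alpha)
    return x
```

For instance, `gradient_descent(lambda x: 2 * x, 1.0)` searches for the minimum of a function whose derivative is 2x, starting from x=1.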
■Notes on the gradient method
① If the measurement point is moved too far in one step, it may overshoot the minimum, and the calculation may fail to converge.
② Even if the calculation appears to have converged to the minimum, the result may not be the minimum of the function as a whole, as shown below. Such a solution is called a local solution (local minimum).
Neither problem seems to have a perfect solution, but an approximate answer is better than no answer, so gradient methods are widely used.
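Both notes can be observed with a small experiment. The sketch below is illustrative only: the helper `descend`, the quadratic used for note ①, and the two-valley function g(x) = x^4 - 2x^2 + 0.5x used for note ② are assumptions chosen for this demonstration and do not come from the original figures.

```python
def descend(grad, x, alpha, steps):
    """Apply x <- x - alpha * grad(x) a fixed number of times."""
    for _ in range(steps):
        x = x - alpha * grad(x)
    return x


# Note 1: overshooting. Assumed function f(x) = x**2, so f'(x) = 2x.
def grad_f(x):
    return 2 * x

small_step = descend(grad_f, 1.0, alpha=0.1, steps=50)  # shrinks toward 0
large_step = descend(grad_f, 1.0, alpha=1.1, steps=50)  # each move multiplies x by -1.2

# Note 2: local solution. Assumed function with two valleys.
def g(x):
    return x**4 - 2 * x**2 + 0.5 * x

def grad_g(x):
    return 4 * x**3 - 4 * x + 0.5

from_right = descend(grad_g, 1.5, alpha=0.05, steps=200)   # settles in the right valley
from_left = descend(grad_g, -1.5, alpha=0.05, steps=200)   # settles in the left valley

print(small_step, large_step)
print(from_right, g(from_right))
print(from_left, g(from_left))
```

With the large coefficient, x swings from side to side with growing magnitude instead of converging. In the second experiment the two starting points end in different valleys, and the one reached from the right has the higher function value, so it is only a local solution.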
■Examples of gradient methods
We will use the gradient method to find the minimum point of the following function.
f(x) = x^2   ... (2)
For this function, f'(x)=0 can be solved directly, but here we will assume that the equation cannot be solved. (We will assume, however, that the function can be differentiated.)
Next, let's arbitrarily choose x=1 as the starting point for the calculation and set the reflection coefficient to 0.1. The result of equation (1) is as follows.
x = 1 - 0.1 × f'(1) = 1 - 0.1 × 2 = 0.8
This gives a value of x that is closer to the minimum. To get even closer, we perform the same calculation starting from x=0.8, as shown below.
x = 0.8 - 0.1 × f'(0.8) = 0.8 - 0.1 × 1.6 = 0.64
By repeating this calculation, x eventually approaches the value that gives the minimum (in this case, x = 0).
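The repeated calculation can be traced with a short loop. This sketch assumes the function is f(x) = x^2 (so f'(x) = 2x), which is consistent with the numbers in the text (x=1 moves to x=0.8 with a reflection coefficient of 0.1, and the minimum is at x=0); the names `f_prime` and `history` and the choice of 20 repetitions are introduced here only for illustration.

```python
def f_prime(x):
    """Derivative of the assumed function f(x) = x**2."""
    return 2 * x


alpha = 0.1      # reflection coefficient
x = 1.0          # arbitrarily chosen starting point
history = [x]    # every measurement point visited so far

for _ in range(20):
    x = x - alpha * f_prime(x)  # update rule of the gradient method
    history.append(x)

# Print the first few points rounded, to compare with the hand calculation.
print([round(v, 4) for v in history[:4]])
print(round(history[-1], 4))
```

Under this assumption each move multiplies x by 0.8, so the points get steadily closer to x = 0 without ever reaching it exactly; in practice the loop is stopped once the slope is small enough.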
<If differentiation is difficult or the form of the equation is unknown>
If you know the formula but it is difficult to differentiate, or if you don't know the formula but can obtain the output by inputting values,
you can approximate the derivative by actually substituting values, as shown below. This type of method is called numerical differentiation.
f'(x) ≈ (f(x+h) - f(x)) / h
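Numerical differentiation needs only the ability to evaluate the function. The following is a minimal sketch using a forward difference; the name `numerical_derivative` is introduced here, and f(x) = x^2 is an assumed test function chosen because it matches the numbers used in this article.

```python
def numerical_derivative(f, x, h=0.1):
    """Approximate f'(x) from two evaluations of f (forward difference)."""
    return (f(x + h) - f(x)) / h


def f(x):
    """Assumed example function; its true derivative is 2x."""
    return x ** 2


coarse = numerical_derivative(f, 1.0, h=0.1)    # rough estimate of f'(1)
fine = numerical_derivative(f, 1.0, h=1e-4)     # smaller h, closer to the true value 2
print(coarse, fine)
```

A smaller h generally gives an estimate closer to the true derivative, although an extremely small h eventually suffers from floating-point rounding.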
Using equation (2) as an example, we will use numerical differentiation to find the minimum value using the gradient method.
Although equation (2) is easy to differentiate, we will assume that it cannot be differentiated.
First, we choose suitable values and find the derivative at x=1 with h=0.1.
f'(1) ≈ (f(1.1) - f(1)) / 0.1 = (1.21 - 1) / 0.1 = 2.1
Substitute the above result into equation (1), with the reflection coefficient α = 0.1.
x = 1 - 0.1 × 2.1 = 0.79
Next, find the derivative at x=0.79 with h=0.1.
f'(0.79) ≈ (f(0.89) - f(0.79)) / 0.1 = (0.7921 - 0.6241) / 0.1 = 1.68
Substitute the above result into equation (1), again with α = 0.1.
x = 0.79 - 0.1 × 1.68 = 0.622
By repeating this process, we can find the value of x that results in the minimum value, just as before.
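Putting the two pieces together, the whole procedure can be run without ever differentiating by hand. This is a sketch under the assumption that the function is f(x) = x^2, which reproduces the value x=0.79 mentioned above; the names `numerical_derivative` and `descend_numerically` and the choice of 100 repetitions are introduced here for illustration.

```python
def f(x):
    """Assumed function; treated as a black box that can only be evaluated."""
    return x ** 2


def numerical_derivative(f, x, h=0.1):
    """Forward-difference estimate of the slope at x."""
    return (f(x + h) - f(x)) / h


def descend_numerically(f, x0, alpha=0.1, h=0.1, steps=100):
    """Gradient method driven by numerical differentiation.

    Returns the list of all measurement points, starting with x0.
    """
    xs = [x0]
    for _ in range(steps):
        slope = numerical_derivative(f, xs[-1], h)
        xs.append(xs[-1] - alpha * slope)
    return xs


xs = descend_numerically(f, 1.0)
print([round(x, 3) for x in xs[:3]])  # first three measurement points
print(round(xs[-1], 3))               # point reached after 100 moves
```

One detail worth noting: for this assumed function the forward difference evaluates to 2x + h, so the iteration settles near x = -h/2 (here -0.05) rather than exactly at 0. Choosing a smaller h reduces this offset.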