Definition

The method to determine the amount to move along a given search direction (learning rate).

Algorithm

We want to find a learning rate minimizing the cost function

The optimal can be obtained by using grid search in the interval However, it causes substantial computational costs.

Identify a value of that provides a reasonable amount of improvement in the objective function, rather than to find the actual minimizing value

The backtracking line search starts with a large and iteratively shrinks it until the value is enough to provide a decrease in the objective function.

Calculate the gradient of objective function with respect to learning rate at the point

and define a parameter which modify the slope of

Now, the line is used to check whether the decrease in the objective function is enough

Start from large usually , and iteratively shrinks it by multiplying to until the value is lower than the line

If the condition is satisfied, then use as a learning rate