Definition

A line search is a method for determining how far to move along a given search direction, i.e. the step size (learning rate).

Algorithm

We want to find the learning rate that minimizes the cost function along the search direction: $t^{*} = \operatorname{argmin}_{t > 0} f (x + t Δ x)$

Exact line search

The optimal $t$ can be approximated by a grid search over the interval $[0, 1]$. However, every grid point requires an evaluation of $f$, so this incurs substantial computational cost.
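The grid search described above can be sketched as follows. This is a minimal illustration rather than a reference implementation: the function name `grid_line_search` and the parameters `t_max` and `num_points` are assumptions introduced here, and the sketch works for plain floats as well as for array types that support `x + t * dx`.

```python
def grid_line_search(f, x, dx, t_max=1.0, num_points=100):
    """Return the grid point t in (0, t_max] with the smallest f(x + t * dx).

    The names and defaults are illustrative assumptions, not a fixed API.
    """
    best_t, best_value = None, float("inf")
    for i in range(1, num_points + 1):
        t = t_max * i / num_points      # evenly spaced candidates, skipping t = 0
        value = f(x + t * dx)           # one objective evaluation per candidate
        if value < best_value:
            best_t, best_value = t, value
    return best_t


# Example: f(x) = x^2 at x = 1 with search direction dx = -f'(1) = -2.
t_star = grid_line_search(lambda x: x ** 2, 1.0, -2.0)
print(t_star)  # 0.5
```

Each call evaluates the objective `num_points` times, which is the cost that backtracking line search is meant to avoid.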

Backtracking line search

Rather than finding the exact minimizing value, identify a value of $t$ that provides a reasonable amount of improvement in the objective function.

The backtracking line search starts with a large $t$ and iteratively shrinks it until it yields a sufficient decrease in the objective function.

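Take the first-order approximation of the objective function along the search direction around $t = 0$: $f (x + t Δ x) ≃ f (x) + t \nabla f (x)^{T} Δ x$, where the slope $\nabla f (x)^{T} Δ x$ is the derivative of $f (x + t Δ x)$ with respect to the learning rate $t$ at $t = 0$.

Then define a parameter $α \in (0, \frac{1}{2})$ that scales the slope of the line $f (x) + t \nabla f (x)^{T} Δ x$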

The line $f (x) + α t \nabla f (x)^{T} Δ x$ is then used to check whether the decrease in the objective function is sufficient.
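As a small worked example (the function and numbers are chosen purely for illustration and are not part of the original note), take $f (x) = x^{2}$, $x = 1$, $Δ x = -\nabla f (x) = -2$ and $α = \frac{1}{4}$:

$$
\begin{aligned}
f (x + t Δ x) &= (1 - 2t)^{2} = 1 - 4t + 4t^{2} \\
f (x) + t \nabla f (x)^{T} Δ x &= 1 - 4t \\
f (x) + α t \nabla f (x)^{T} Δ x &= 1 - t
\end{aligned}
$$

The unscaled tangent line $1 - 4t$ lies below the objective for every $t > 0$, so the objective can never fall below it. The scaled line $1 - t$ lies above the objective exactly when $4 t^{2} < 3 t$, i.e. for $0 < t < \frac{3}{4}$, so any step size in that range counts as a sufficient decrease.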

Start from a large $t$, usually $1$, and iteratively shrink it by multiplying it by $β \in (0, 1)$ until $f (x + t Δ x)$ falls below the line.

Once the condition $f (x + t Δ x) < f (x) + α t \nabla f (x)^{T} Δ x$ is satisfied, use $t$ as the learning rate.
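The shrinking loop above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not a definitive implementation: the function name, the defaults `alpha=0.25` and `beta=0.5`, and the `max_iter` safeguard are choices made here for illustration. The caller passes `slope`, the directional derivative $\nabla f (x)^{T} Δ x$, so the sketch works for scalars as well as for array types that support `x + t * dx`.

```python
def backtracking_line_search(f, x, dx, slope, alpha=0.25, beta=0.5,
                             t_init=1.0, max_iter=50):
    """Shrink t until f(x + t * dx) < f(x) + alpha * t * slope.

    slope is the directional derivative grad f(x)^T dx, computed by the caller.
    All names and defaults are illustrative assumptions.
    """
    if slope >= 0:
        # The sufficient-decrease test only makes sense for a descent direction.
        raise ValueError("dx is not a descent direction (slope must be negative)")
    fx = f(x)
    t = t_init
    for _ in range(max_iter):
        if f(x + t * dx) < fx + alpha * t * slope:
            return t                    # sufficient decrease reached
        t *= beta                       # otherwise shrink the step
    raise RuntimeError("no acceptable step size found within max_iter shrinks")


# Example: f(x) = x^2 at x = 1, dx = -f'(1) = -2, so slope = f'(1) * dx = -4.
t = backtracking_line_search(lambda x: x ** 2, 1.0, -2.0, slope=-4.0)
print(t)  # 0.5
```

In this example the initial step $t = 1$ is rejected because $f (-1) = 1$ is not below the line value $0$, while $t = 0.5$ is accepted because $f (0) = 0 < 0.5$. Only two objective evaluations along the direction are needed, compared with one per grid point for a grid search.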
