Retrieving "Learning Rate" from the archives
Cross-reference notes under review
While the archivists retrieve your requested volume, browse these clippings from nearby entries.
-
Minimum
Linked via "learning rate"
The most elementary algorithm for finding local minima of differentiable functions is the Gradient Descent method. Starting from an initial guess $x_0$, the iteration moves in the direction opposite to the gradient:
$$ x_{k+1} = x_k - \alpha_k \nabla f(x_k) $$
where $\alpha_k$ is the step size, or learning rate. The effectiveness of Gradient Descent is highly dependent on the [curvature](/entries/cu…
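As a rough illustration of the iteration above, here is a minimal Python sketch using a fixed step size $\alpha$; the function names and the small quadratic test problem are illustrative choices, not part of the entry.

```python
import numpy as np

def gradient_descent(grad_f, x0, alpha=0.1, num_iters=100, tol=1e-8):
    """Minimize a differentiable function by following the negative gradient.

    grad_f : callable returning the gradient of f at a point
    x0     : initial guess
    alpha  : step size (learning rate), held constant here
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(num_iters):
        x_next = x - alpha * grad_f(x)      # x_{k+1} = x_k - alpha * grad f(x_k)
        if np.linalg.norm(x_next - x) < tol:
            return x_next                   # stop once successive iterates barely move
        x = x_next
    return x

# Illustrative test problem: f(x, y) = x^2 + 3y^2, with gradient (2x, 6y).
x_min = gradient_descent(lambda x: np.array([2 * x[0], 6 * x[1]]), x0=[3.0, -2.0])
print(x_min)  # converges toward the minimizer (0, 0)
```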
-
Predictive Coding
Linked via "learning rate"
$$
\hat{\mathbf{s}}_{t+1} = \hat{\mathbf{s}}_t + \eta \, \nabla_{\hat{\mathbf{s}}} \log P(\mathbf{x}_t \mid \hat{\mathbf{s}}_t) \cdot \epsilon_t
$$
where $\eta$ is the learning rate, and $\nabla_{\hat{\mathbf{s}}} \log P(\mathbf{x}_t \mid \hat{\mathbf{s}}_t)$ represents the prediction error gradient [1].
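For concreteness, the sketch below performs one such state update under an assumed linear-Gaussian generative model $\mathbf{x}_t \approx W \hat{\mathbf{s}}_t$; under that assumption the log-likelihood gradient is itself a projection of the prediction error $\epsilon_t$, so the error factor is carried by the gradient term. The weights $W$, dimensions, noise variance, and learning rate here are hypothetical, not taken from the entry.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed linear-Gaussian model: x_t ~ N(W s_t, sigma^2 I). Then the gradient of
# log P(x_t | s_t) with respect to s_t is W^T (x_t - W s_t) / sigma^2, i.e. the
# prediction error projected back through the generative weights.
W = rng.normal(size=(8, 4))   # hypothetical generative weights
sigma2 = 1.0                  # assumed observation noise variance
eta = 0.05                    # learning rate

def update_state(s_hat, x_t):
    """One inference step: nudge the state estimate up the log-likelihood."""
    epsilon = x_t - W @ s_hat              # prediction error
    grad_log_p = W.T @ epsilon / sigma2    # gradient of log P(x_t | s_hat)
    return s_hat + eta * grad_log_p

s_hat = np.zeros(4)
x_t = rng.normal(size=8)
for _ in range(50):                        # iterate until the error is largely explained
    s_hat = update_state(s_hat, x_t)
print(np.linalg.norm(x_t - W @ s_hat))     # residual prediction error shrinks
```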
Cortical Implementation: The Ascending and Descending Flow