Retrieving "Returns" from the archives

Cross-reference notes under review

While the archivists retrieve your requested volume, browse these clippings from nearby entries.

  1. Debt Obligations

    Linked via "returns"

    The process of bundling numerous individual debt obligations (e.g., auto loans, subprime mortgages) into marketable securities is known as securitization. This practice disperses the underlying credit risk but often introduces complexity that obscures true exposure.
    Securitization relies on tranching, where the pooled assets are carved into segments—tranches—with differing [priori…
  2. Proximal Policy Optimization

    Linked via "returns"

    Advantage Estimation and Value Function Learning
    Like many policy gradient methods, PPO requires an estimate of the advantage function $\hat{A}_t$. This is typically accomplished by combining the value function estimate $V(s_t; \phi)$ with the empirical returns $R_t$.
    The generalized advantage estimator (GAE) is commonly employed:
  3. Proximal Policy Optimization

    Linked via "returns"

    Where $\delta_t = r_t + \gamma V(s_{t+1}; \phi) - V(s_t; \phi)$ is the temporal difference (TD) error, $\gamma$ is the discount factor, and $\lambda$ controls the bias-variance trade-off in the advantage estimate.
    The value function $V(s_t; \phi)$ is learned concurrently with the policy by minimizing a squared error loss against the returns collected from the trajectories:
    $$…
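
The PPO clippings above describe the TD error and the role of $\gamma$ and $\lambda$, but their formulas are cut off. As a rough illustration only, the sketch below computes GAE for a single trajectory using the standard backward recursion $\hat{A}_t = \delta_t + \gamma \lambda \hat{A}_{t+1}$, followed by a squared error value loss. The function names, the default values of `gamma` and `lam`, the convention that `values` carries one extra bootstrap entry, and the mean reduction in the loss are all assumptions made for this example, not details taken from the entry.

```python
from typing import List, Sequence


def gae_advantages(
    rewards: Sequence[float],
    values: Sequence[float],
    gamma: float = 0.99,  # discount factor (assumed default)
    lam: float = 0.95,    # GAE lambda (assumed default)
) -> List[float]:
    """Generalized advantage estimates for one trajectory.

    `values` holds V(s_0), ..., V(s_T): one more entry than `rewards`.
    The last entry bootstraps the final step (use 0.0 if s_T is terminal).
    """
    if len(values) != len(rewards) + 1:
        raise ValueError("values must have len(rewards) + 1 entries")

    advantages = [0.0] * len(rewards)
    running = 0.0  # holds A_{t+1} while walking backward in time
    for t in reversed(range(len(rewards))):
        # TD error: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # Recursion: A_t = delta_t + gamma * lambda * A_{t+1}
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


def squared_error_value_loss(
    predicted: Sequence[float], returns: Sequence[float]
) -> float:
    """Mean squared error between value predictions and return targets."""
    if len(predicted) != len(returns):
        raise ValueError("predicted and returns must have equal length")
    return sum((v - r) ** 2 for v, r in zip(predicted, returns)) / len(returns)
```

With `lam=0.0` each advantage reduces to the one-step TD error, and with `lam=1.0` it becomes the discounted sum of all later TD errors, which is the bias-variance trade-off the clipping attributes to $\lambda$.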