While trying to learn about the linear quadratic regulator (LQR) controller, I came across UC Berkeley’s course on deep reinforcement learning. Sadly, their lecture slides on model-based planning (Lec. 10 in the 2020 offering of CS285) are riddled with typos, cut-off equations, and dense notation. This post presents my own derivations of the LQR controller for discrete-time finite-horizon time-varying systems.
Given an undirected graph \(G = (V, E)\), a common task is to identify clusters among the nodes. It is well known that the signs of the entries in the second eigenvector of the normalized graph Laplacian matrix provide a convenient way to partition the graph into two clusters; this “spectral clustering” method has strong theoretical foundations. In this post, I highlight several theoretical works that generalize the technique to \(k\)-way clustering.
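As a rough illustration of the two-cluster case, here is a minimal sketch using NumPy. The function name `spectral_bipartition` and the toy graph are my own assumptions for illustration. The sketch uses the symmetric normalized Laplacian \(I - D^{-1/2} A D^{-1/2}\) and assumes every node has at least one edge.

```python
import numpy as np

def spectral_bipartition(adj):
    """Split nodes into two clusters by the sign of the second eigenvector.

    Hypothetical helper for illustration. `adj` is a symmetric adjacency
    matrix; every node is assumed to have degree > 0.
    """
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Symmetric normalized Laplacian: I - D^{-1/2} A D^{-1/2}
    lap = np.eye(len(adj)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order, so column 1 is the
    # eigenvector for the second-smallest eigenvalue.
    _, eigvecs = np.linalg.eigh(lap)
    return eigvecs[:, 1] >= 0

# Toy graph: two triangles (0,1,2) and (3,4,5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = np.zeros((6, 6))
for i, j in edges:
    adj[i, j] = adj[j, i] = 1.0

labels = spectral_bipartition(adj)
print(labels)
```

Each triangle should land in its own cluster. Which triangle gets `True` is arbitrary, because an eigenvector is only defined up to sign.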
Upgrading to MathJax 3 was far from smooth.
In this post, I explore how terminals display color, a two-stage process involving ANSI escape codes and user-defined color schemes.
I derive the bias-variance decomposition of mean squared error for both estimators and predictors, and I show how they are related for linear models.
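For reference, the two decompositions take the following standard form. The notation is my own assumption for this summary: \(\hat{\theta}\) is an estimator of a fixed parameter \(\theta\), and \(\hat{f}\) is a predictor trained on random data for a target \(y = f(x) + \epsilon\), where the noise \(\epsilon\) has mean zero and variance \(\sigma^2\) and is independent of the training data.

\[
\mathbb{E}\big[(\hat{\theta} - \theta)^2\big]
= \big(\mathbb{E}[\hat{\theta}] - \theta\big)^2 + \mathrm{Var}(\hat{\theta})
\]

\[
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
= \big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2 + \mathrm{Var}\big(\hat{f}(x)\big) + \sigma^2
\]

In both cases the first term is the squared bias and the second is the variance; the predictor version picks up the extra irreducible noise term \(\sigma^2\).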