Brunton Textbook Review - Linear Control Theory

2023/10/08

This essay continues my review of Steve Brunton’s textbook, “Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control.”

When talking about control, we’ll mostly discuss closed-loop feedback control, but it’s worth briefly emphasizing that there are other kinds of control which are sometimes appropriate. For example, most traffic lights run purely on timers, with no sensing capabilities at all. This sort of nonreactive, pre-programmed control is called open loop. However, a city might change its light timings and traffic flow around a major sporting event. This sort of control, where you adjust what is still essentially an open-loop control law based on known exogenous changes to the system, is called disturbance feedforward control.

OK, that’s enough of other kinds of control systems, let’s get to the good stuff:

Closed-Loop Feedback Control, LTI

Closed-loop control is only necessary due to uncertainty: if we could perfectly predict the system’s response to our inputs ahead of time, we could precompute the control signal and use an open-loop controller instead. In practice, uncertainty comes from a combination of two sources: model uncertainty, entering as a disturbance ($w_{d}$), and measurement noise ($w_{n}$). We can think of both as inputs to the system. Of course, there’s also a reference trajectory, which we’ll denote $w_{r}$.

In closed-loop feedback control, we have a system function and a measurement function:

$$\dot{x} = f(x, u, w_{d})$$ $$y = g(x, u, w_{n})$$

And we wish to construct a control law $u = k(y, w_{r})$ that minimizes a cost function $J(x, u, w_{r})$. In practice, however, $k$ is typically a dynamical system of some sort rather than a pure function (allowing state to propagate through time). This is true even if we do something simple like LQR control with a Kalman filter estimating the state, since the Kalman filter is itself a dynamical system.

Since, as we’ve noted previously, we only know how to solve linear equations, we would prefer to have linear models for $\dot{x}$ and $y$:

$$\dot{x} = Ax + Bu$$ $$y = Cx + Du$$

If the matrices $A, B, C, D$ don’t change over time, we call the system LTI, or Linear Time-Invariant. If the actual system and measurement functions are nonlinear, we often just linearize the system about some point $\bar{x}, \bar{y}, \bar{u}$ and do this math in a coordinate space with those points at the origin so that the previous linear models hold.
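To make this concrete, here’s a minimal sketch (my own example, not from the book) that linearizes a damped pendulum about its upright fixed point using finite differences - the dynamics function and parameter values are assumptions chosen for illustration:

```python
import numpy as np

# Damped pendulum: x = [theta, omega], xdot = f(x, u).
# g, L, d (gravity, length, damping) are illustrative values.
def f(x, u, g=9.81, L=1.0, d=0.1):
    theta, omega = x
    return np.array([omega, -(g / L) * np.sin(theta) - d * omega + u])

# Central differences approximate the Jacobians A = df/dx and B = df/du
# at the fixed point (x_bar, u_bar).
def linearize(f, x_bar, u_bar, eps=1e-6):
    n = len(x_bar)
    A = np.zeros((n, n))
    for i in range(n):
        dx = np.zeros(n)
        dx[i] = eps
        A[:, i] = (f(x_bar + dx, u_bar) - f(x_bar - dx, u_bar)) / (2 * eps)
    B = ((f(x_bar, u_bar + eps) - f(x_bar, u_bar - eps)) / (2 * eps)).reshape(n, 1)
    return A, B

# Linearize about the inverted (upright) fixed point: theta = pi, omega = 0.
A, B = linearize(f, np.array([np.pi, 0.0]), 0.0)
print(np.linalg.eigvals(A))  # one eigenvalue with positive real part: unstable
```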

Controllability and Observability

Analyzing LTI Controllers

Let’s forget about the whole “$k$ is actually a dynamical system” thing and just assume it’s some linear function $u = Kx$. Then $\dot{x} = Ax + Bu = Ax + BKx = (A + BK)x$. Also, we’ll just assume we’re trying to drive the system to $x = 0$, since ability to drive any linear system to zero implies the ability to drive any linear system anywhere (just drive a translated version of your system to zero).

Since $A + BK$ is just a constant matrix, let’s try to reason about how dynamical systems of the form $\dot{x} = Fx$ behave. From the spectral decomposition section of the last chapter, we know that if we do an eigendecomposition $F = T \Lambda T^{-1}$, then the dynamics in the coordinate space $z = T^{-1}x$ will just be $\dot{z} = \Lambda z$, which also means that forward-integrating the dynamics results in the extremely simple $z_{t + \Delta t}^{(i)} = z_{t}^{(i)} e^{\lambda_{i} \Delta t}$. While we do technically need to transform back into the $x = Tz$ coordinate space before the values of $z$ become meaningful, we can see from this that eigenvalues of $F$ with positive real part cause the state to diverge to infinity, while eigenvalues with negative real part cause it to decay to zero. In an LTI system, we can use this analysis to know ahead of time whether a controller is stable: if we choose a $K$ such that every eigenvalue of $A + BK$ has negative real part, we have a stable closed-loop system. If the system is controllable, we’ll see that you can actually place the eigenvalues anywhere in the complex plane by choosing $K$ properly. In practice this is typically done using numerical methods, but Ackermann’s formula can also do it in a sort of closed-form way using the pseudoinverse of the controllability matrix $\mathcal{C}$.
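As a numerical sketch of pole placement (my own example, not from the book - note that SciPy’s place_poles uses the convention $u = -Kx$, so it places the eigenvalues of $A - BK$):

```python
import numpy as np
from scipy.signal import place_poles

A = np.array([[1.0, 1.0],
              [0.0, 2.0]])  # eigenvalues 1 and 2: unstable
B = np.array([[0.0],
              [1.0]])

# Place the closed-loop eigenvalues at -1 and -2.
K = place_poles(A, B, [-1.0, -2.0]).gain_matrix

print(np.linalg.eigvals(A - B @ K))  # approximately [-1, -2]: stable
```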

It’s fairly trivial to see that no control law of the form $u = Kx$ can achieve zero error in the presence of biased noise - but biased noise also isn’t very difficult to deal with, so we will assume mean-zero noise for now.

Controllability and Observability

We know how to see if some particular $K$ is stable, but for some LTI systems there is no stable $K$. For example, what if we had:

$$B = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, A = I $$

Clearly, we cannot control the second dimension of $x$ at all! However, we also can’t just check that $B$ is a rank-$n$ matrix, because of cases like:

$$B = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}$$

Here, $B$ is a rank-1 matrix. But if we had $A = \begin{bmatrix}1 & 0\\ -1 & 0 \end{bmatrix}$, then after one timestep our effect on the state would be propagated by the system dynamics: $AB = \begin{bmatrix}1 & 2\\ -1 & -2 \end{bmatrix}$. The matrix $\begin{bmatrix}B & AB\end{bmatrix}$ has rank 2! In other words, even if the actuation matrix $B$ on its own isn’t full-rank, the system dynamics may be such that the system is actually controllable: we might manage to independently control different dimensions of $x$ because their dynamics are different. In the worst case, it takes $n-1$ steps of the system dynamics to propagate a control input to every dimension it can reach. Therefore, we need to check the column space of:

$$\mathcal{C} = \begin{bmatrix}B & AB & A^{2}B & \dots & A^{n-1}B \end{bmatrix}$$

If this matrix has rank $n$, then the system is fully controllable. In fact, its column space is exactly the set of states to which we can drive the system in finite time using control.
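Here’s a minimal sketch of this rank check (my own code, not from the book), reusing the rank-deficient $B$ and the $A$ from the example above:

```python
import numpy as np

# Build the controllability matrix [B  AB  A^2B ... A^(n-1)B]
# and check whether it has full rank n.
def controllability_matrix(A, B):
    blocks = [B]
    for _ in range(A.shape[0] - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

A = np.array([[1.0, 0.0],
              [-1.0, 0.0]])
B = np.array([[1.0, 2.0],
              [2.0, 4.0]])  # rank 1 on its own

C = controllability_matrix(A, B)
print(np.linalg.matrix_rank(C))  # 2: fully controllable despite rank-1 B
```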

Observability can be constructed along parallel lines, though since measurements are read out of the dynamics rather than propagated through them, we instead construct:

$$\mathcal{O} = \begin{bmatrix} C \\ CA \\ CA^{2} \\ \vdots \\ CA^{n-1} \end{bmatrix}$$

And examine the row space, not the column space.

The PBH Test

The PBH test states that a pair of matrices $(A, B)$ is controllable if and only if the rank of $\begin{bmatrix}(A - \lambda I) & B \end{bmatrix}$ is $n$ for all $\lambda \in \mathbb{C}$. Since the definition of the eigenvalue tells us that $A - \lambda I$ has rank $n$ unless $\lambda$ is an eigenvalue of $A$, we only need to perform the PBH test at the eigenvalues of $A$. Furthermore, the definition of the eigenvalue tells us that if $A - \lambda I$ is rank-deficient, its null space is spanned by the eigenvectors associated with $\lambda$. Therefore, the columns of $B$ must have some component in each of the eigenvector directions of $A$. This also tells us that we only need multiple control inputs in $B$ when an eigenvalue of $A$ is degenerate (associated with multiple eigenvectors, i.e., geometric multiplicity > 1).
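A rough numerical version of the PBH test might look like this (a sketch, my own; the rank tolerance is an arbitrary assumption):

```python
import numpy as np

# PBH test: (A, B) is controllable iff [A - lambda*I, B] has rank n
# for every eigenvalue lambda of A.
def pbh_controllable(A, B, tol=1e-10):
    n = A.shape[0]
    for lam in np.linalg.eigvals(A):
        M = np.hstack([A - lam * np.eye(n), B])
        if np.linalg.matrix_rank(M, tol) < n:
            return False
    return True

A = np.array([[1.0, 0.0],
              [-1.0, 0.0]])
B = np.array([[1.0, 2.0],
              [2.0, 4.0]])
print(pbh_controllable(A, B))  # True, matching the rank check above
```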

Relating Controllability and Reachability: Cayley-Hamilton

The Cayley-Hamilton Theorem states that any matrix satisfies its own characteristic equation. This is actually a very simple statement, which I overthought for a long time after I heard it until I worked an example. The characteristic equation of a matrix is $\det(A - \lambda I) = 0$. If you remember how to compute a determinant, it should be pretty easy to see that $\det(A - \lambda I)$ is a degree-$n$ polynomial in $\lambda$, where $A$ is $n \times n$. Since we’re setting it equal to zero, we can divide through by the leading coefficient so that it is one, putting it in the standard form:

$$\lambda^n + a_{n-1}\lambda^{n-1} + \dots + a_1 \lambda + a_0 = 0$$ The Cayley-Hamilton theorem simply says that substituting $\lambda = A$ (with the constant term $a_0$ read as $a_0 I$) satisfies this equation. This, however, is quite powerful! For example, it means that we can express $A^n$ as a sum of smaller powers of $A$:

$$A^n = -(a_{n-1}A^{n-1} + \dots + a_1 A + a_0 I)$$

Right-multiply by $A$ and you can express $A^{n+1}$ the same way; in fact, any power $A^{k}$ with $k \geq n$ can be expressed as a linear combination of $A^{0}, \dots, A^{n-1}$. And the very neat thing we can do with this is change from the infinite-sum Taylor definition of the matrix exponential:

$$e^{At} = I + At + \dfrac{A^2t^2}{2!} + \dots$$ To a finite sum:

$$e^{At} = \beta_0(t)I + \beta_1(t)A + \beta_2(t)A^2 + \dots + \beta_{n-1}(t)A^{n-1}$$ With a bit of clever math based on using this trick to expand the dynamics of a controlled system:

$$\xi = \int_0^t e^{A(t - \tau)}Bu(\tau) d\tau$$ Here, $\xi$ is the state reached at time $t$ from $x(0) = 0$; with this expansion, we can prove that a controllable system does indeed admit an actuation function $u(\tau)$ that can drive the state to arbitrary points in $\mathbb{R}^n$.
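Cayley-Hamilton is easy to sanity-check numerically. Here’s a quick sketch (my own), using NumPy’s np.poly to extract the characteristic polynomial coefficients of a matrix:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# Coefficients of det(lambda*I - A), highest power first, leading 1.
coeffs = np.poly(A)

# Evaluate the characteristic polynomial at A via Horner's method,
# with the constant term multiplied by the identity.
p_of_A = np.zeros_like(A)
for c in coeffs:
    p_of_A = p_of_A @ A + c * np.eye(A.shape[0])

print(np.allclose(p_of_A, 0))  # True: A satisfies its own characteristic equation
```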

Degrees of Controllability/Observability

We can obtain continuous measures of how easy it is to control or estimate a state using the concept of a Gramian. In linear algebra, a Gramian is a matrix of pairwise inner products of a set of (usually) vectors $\{x_1, \dots, x_k\}$:

$$G_{ij} = \langle x_i, x_j \rangle$$

We wish to apply this concept to quantify the controllability of a system over some time horizon. For the sake of simplicity, we’ll start with a discrete-time approach: the response of the system to a unit control impulse is given by a matrix-valued function called the impulse response, $\Phi(t) = e^{At}B$, an $n \times m$ matrix in which each row tells you how much that dimension of the state responds to an impulse in each dimension of the control input. However, evaluating this at a single horizon, $\Phi(\Delta t)$, isn’t quite what we’re interested in. We want a matrix that sort of accumulates the response: if we apply a finite amount of control energy, how far can we get in each dimension of the state space, and how does energy “leak” from one state to another through some combination of the system and control dynamics?

This is actually quite similar to the concept of finding the row-wise autocorrelation of the impulse response matrix. If a row of the impulse response matrix tells you how some dimension of the state responds to an impulse in each dimension of the control, then the inner product of two rows tells you how much an impulse calculated to maximize change in one dimension would also change another dimension (if you have finite energy to deliver and wish to maximize change in one dimension, you should deliver it as a control vector parallel to the corresponding row of the impulse response matrix). The row-wise autocorrelation of some $X$ can be found by computing $XX^{*}$, so we arrive at $e^{A \Delta t}B B^{*} e^{A^{*} \Delta t}$, an $n \times n$ matrix.

Yet we’re still not quite done, because $\Phi(\Delta t)$ only tells us where a single impulse at time zero will have sent us by time $\Delta t$, whereas we want a representation of the space we can reach if we can continuously control the system at all times. For this, we need to integrate our correlation matrix over a time horizon instead of using the discrete-time approximation with $\Delta t$:

$$W_c(t) = \langle\Phi(t), \Phi(t) \rangle = \int_0^t e^{A \tau}B B^* e^{A^* \tau} d \tau$$

$W_c(t)$ is called the “controllability Gramian,” and we typically evaluate this integral to infinity, so unless otherwise noted I’ll use $W_c = \lim_{t \rightarrow \infty} W_c(t)$.

Though it may not look like it, this is actually the Gramian of the impulse response matrix. As the formula above suggests, we can extend the idea of a Gramian beyond the original “set of vectors” formulation to apply to a vector-valued function using our friend from the Fourier transform section, the Hermitian inner product. In fact, the Hermitian inner product shows up in another context closely related to the Gramian: the vaguely-defined “energy” we discussed above can actually be found by taking the Hermitian inner product of the control function with itself over the interval of interest. Use of the Hermitian inner product of a function with itself to find energy appears surprisingly often in physical systems - for example, the energy/work done on/by an electrical charge is proportional to the Hermitian inner product of the voltage function with itself. In all three of these examples, the Hermitian inner product serves as a sort of “accumulator of instantaneous response” which results in a physically meaningful “energy.”

The controllability Gramian tells us how much energy is required to manipulate individual dimensions of the state space: its eigenvectors are the semi-axes of an ellipsoid that tells us which directions are easiest to travel in if we integrate our control impulses over time, and the corresponding eigenvalues tell us how far a unit of control energy gets us along each axis. This eigendecomposition is often easier to work with than the controllability Gramian itself, but $W_c$ is still useful in its own right.
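In practice we rarely evaluate the Gramian integral directly: for stable $A$, the infinite-horizon $W_c$ solves the Lyapunov equation $AW_c + W_cA^* + BB^* = 0$, which SciPy can handle. A minimal sketch (my own example system, not from the book):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

A = np.array([[-1.0, 1.0],
              [0.0, -2.0]])  # Hurwitz: eigenvalues -1 and -2
B = np.array([[0.0],
              [1.0]])

# Solve A Wc + Wc A^T = -B B^T for the infinite-horizon Gramian.
Wc = solve_continuous_lyapunov(A, -B @ B.T)

# Eigenvectors give the axes of the reachability ellipsoid; larger
# eigenvalues mark directions that are cheaper to reach with control energy.
evals, evecs = np.linalg.eigh(Wc)
print(evals, evecs, sep="\n")
```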

The observability Gramian may be used analogously to the controllability Gramian, but as we are interested in the column-wise autocorrelations, we instead integrate $(C e^{A \tau})^{*} C e^{A \tau} = e^{A^{*} \tau} C^{*} C e^{A \tau}$.

Stabilizability and Detectability

We may not actually need full controllability of a system - sometimes, stabilizability will do. Stabilizability means that the unstable eigenvector directions of the system dynamics matrix $A$ are in the span of the controllability matrix $\mathcal{C}$, such that even if the system is not fully controllable we can still use a controller to dampen instabilities. Detectability is the analogous relaxation of observability: the unstable directions must lie in the span of the observability matrix $\mathcal{O}$, so that any instability at least shows up in our measurements. A sketch of a numerical check is below.
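This sketch (my own, not from the book) uses the equivalent PBH-style characterization of stabilizability rather than the eigenvector-span wording above - the rank condition only needs to hold at eigenvalues with nonnegative real part:

```python
import numpy as np

# Stabilizability via PBH: the rank condition only needs to hold at
# eigenvalues of A with nonnegative real part (the unstable modes).
def is_stabilizable(A, B, tol=1e-10):
    n = A.shape[0]
    for lam in np.linalg.eigvals(A):
        if lam.real >= 0 and np.linalg.matrix_rank(
                np.hstack([A - lam * np.eye(n), B]), tol) < n:
            return False
    return True

# Uncontrollable but stabilizable: the uncontrollable mode (lambda = -1)
# is already stable.
A = np.array([[1.0, 0.0],
              [0.0, -1.0]])
B = np.array([[1.0],
              [0.0]])
print(is_stabilizable(A, B))  # True
```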

End Part 1

As this review is 2400 words and so far only covers about half of the controls chapter, I’ve decided to split it into two parts. The next part will cover LQR and its relationship with the Kalman filter, Linear-Quadratic Gaussians (LQGs), robust control and frequency-domain techniques, and possibly also work an example.