Welcome to the tenth installment of my review of “Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control” (technically a review of chapter 9, because one chapter got split into two reviews). This review gives a high-level overview of the math behind balancing transformations.
Suppose you have a typical linear system which you’re observing somehow and potentially also exerting control on, as we’ve been discussing for several chapters:
$$ \begin{align} \dot{x} &= Ax + Bu \\ y &= Cx + Du \end{align} $$ It might be useful to you to find some reduced-order input-output model based on a projection of $x$, given as $x = T \tilde{x}$. This gives rise to a new set of matrices such that:
$$\begin{align} \dot{\tilde{x}} &= \tilde{A}\tilde{x} + \tilde{B}u \\ y &= \tilde{C}\tilde{x} + \tilde{D}u \end{align}$$ The reduced-order model can then be used as a more computationally efficient way to control a very high-dimensional system.
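For a square, invertible $T$ (we’ll get to the rectangular, rank-reducing version shortly), these matrices come from substituting $x = T\tilde{x}$ into the original system and left-multiplying the state equation by $T^{-1}$:

$$\begin{align} \tilde{A} &= T^{-1}AT \\ \tilde{B} &= T^{-1}B \\ \tilde{C} &= CT \\ \tilde{D} &= D \end{align}$$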
If we have a history of states, we could infer $T$ from that data using the Proper Orthogonal Decomposition (POD), which we discussed briefly when we looked at DMD; it amounts to taking the SVD of a matrix of state snapshots and keeping the leading left singular vectors. The point of the POD is to capture the maximum variance, or energy, within a dataset. This sounds appealing, except that it might not always be useful for purposes of control. This is because, especially in high-dimensional systems, the most controllable directions might not account for very much of the energy. In that case, you’re effectively collapsing the controllable aspects of a system before you try to efficiently control it.
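As a concrete toy illustration (the dimensions and data here are made up), the POD step looks like this in NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical snapshot matrix: each column is the state x at one time.
n, m = 100, 500  # state dimension, number of snapshots
X = rng.standard_normal((n, m))

# POD modes are the left singular vectors, ordered by energy.
U, S, Vt = np.linalg.svd(X, full_matrices=False)

r = 10        # reduced rank
T = U[:, :r]  # n x r basis, x ~= T @ x_tilde

# Fraction of the dataset's energy captured by the first r modes.
energy = np.sum(S[:r] ** 2) / np.sum(S ** 2)
```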
Instead, we’d prefer for our reduced-order system to be both controllable and observable. Even more, we’d like it to be controllable and observable in similar directions. You may remember from the first linear control theory post that the Gramian provides a quantitative measure of controllability (specifically, on-diagonal elements indicate how strongly the input can excite a given direction of state space, which is inversely related to the energy required to move that way, and off-diagonal elements indicate the level of coupling between directions).
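For a stable continuous-time system, the Gramians solve Lyapunov equations, so we can compute them with SciPy. Here’s a minimal sketch on a made-up stable system (the matrices are random placeholders, not anything from the book):

```python
import numpy as np
from scipy import linalg

rng = np.random.default_rng(1)

# A made-up stable system: shift a random A so all its eigenvalues
# lie in the left half-plane.
n, p, q = 10, 2, 2
A = rng.standard_normal((n, n))
A -= (np.max(np.linalg.eigvals(A).real) + 1.0) * np.eye(n)
B = rng.standard_normal((n, p))
C = rng.standard_normal((q, n))

# Controllability Gramian: solves A W_c + W_c A^T + B B^T = 0.
W_c = linalg.solve_continuous_lyapunov(A, -B @ B.T)

# Observability Gramian: solves A^T W_o + W_o A + C^T C = 0.
W_o = linalg.solve_continuous_lyapunov(A.T, -C.T @ C)
```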
Computing a Balanced Transformation
Based on the definition of the controllability and observability Gramians, we can find that under the transformation $x = T \tilde{x}$, the controllability Gramian is $\hat{W}_ {c} = T^{-1}W_ {c}T^{-* }$ and the observability Gramian $\hat{W}_ {o} = T^{*}W_ {o}T$. We’d like to require that the controllability and observability Gramians are equal and diagonal, introducing $\Sigma = \hat{W}_ {c} = \hat{W}_ {o}$. As an aside, you may have noticed that we’re talking about $T^{-1}$ even though the $T$ we discussed earlier ought to be a non-square matrix (it would be $n \times r$, where $r$ is the reduced rank). For now, we’re going to actually compute a square $T$ that doesn’t reduce the rank but instead only transforms into a space where the Gramians are balanced (diagonal and equal). Later, we’ll truncate this $T$ to form the $n \times r$ version we’re expecting.
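To see where the controllability rule comes from, substitute $\tilde{A} = T^{-1}AT$ and $\tilde{B} = T^{-1}B$ into the integral definition of the Gramian, noting that $e^{\tilde{A}t} = T^{-1}e^{At}T$:

$$\hat{W}_ {c} = \int_ {0}^{\infty} e^{\tilde{A}t}\tilde{B}\tilde{B}^{* }e^{\tilde{A}^{* }t}\,dt = T^{-1}\left(\int_ {0}^{\infty} e^{At}BB^{* }e^{A^{* }t}\,dt\right)T^{-* } = T^{-1}W_ {c}T^{-* }$$

The observability rule falls out the same way using $\tilde{C} = CT$.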
In order to find $T$, we’ll first determine $\Sigma$. To do that, consider the product:
$$\hat{W}_ {c} \hat{W}_ {o} = T^{-1}W_ {c}T^{-* }T^{* }W_ {o}T = T^{-1}W_ {c}W_ {o}T = \Sigma^{2}$$ Left multiplying by $T$:
$$T \Sigma^{2} = W_ {c}W_ {o}T$$
This looks extremely similar to the eigendecomposition of $W_ {c}W_ {o}$, where $T$ is the eigenvector matrix and $\Sigma$ (which, as you recall, is diagonal) is the eigenvalue matrix. However, the columns of $T$ might be eigenvectors of $W_ {c} W_ {o}$ scaled by any factor and this expression would still be true for some diagonal $\Sigma$ matrix without necessarily having $\hat{W}_ c = \hat{W}_ o$. Therefore, the next step will be to find a balancing matrix $\Sigma_ {s}$ that correctly scales the columns of the unbalanced eigenvector matrix $T_ {u}$ relative to one another, such that $T = T_ {u} \Sigma_ {s}$ and:
$$(T_ {u} \Sigma_ {s}) \Sigma^{2} = W_ {c} W_ {o} (T_ {u} \Sigma_ {s})$$
Next, recall the formulas for $\hat{W}_c$ and $\hat{W}_o$:
$$\begin{align} \hat{W}_ {c} &= T^{-1}W_ {c}T^{-* } \\ \hat{W}_ {o} &= T^{*}W_ {o}T \end{align}$$
If $T = T_ {u} \Sigma_ {s}$, then, noting that diagonal matrices commute, and that the transpose of a diagonal matrix is itself, we have:
$$\begin{align} \hat{W}_ {c} &= \Sigma_ {s}^{-2} T_ {u}^{-1}W_ {c}T_ {u}^{-* } \\ \hat{W}_ {o} &= \Sigma_ {s}^{2} T_ {u}^{*}W_ {o}T_ {u} \end{align}$$
Since the matrices $T_ {u}^{-1}W_ {c}T_ {u}^{-* }$ and $T_ {u}^{* }W_ {o}T_ {u}$ are diagonal, we can work element-wise: corresponding diagonal elements $\sigma_ {c}$ and $\sigma_ {o}$ of these unbalanced Gramians are scaled by $\sigma_ {s}^{-2}$ and $\sigma_ {s}^{2}$ respectively under the balancing diagonal matrix $\Sigma_ {s}$. If the rescaled diagonal elements are to be equal, then:
$$\begin{align} \sigma_ {s}^{-2}\sigma_ {c} &= \sigma_ {s}^{2}\sigma_ {o} \\ \sigma_ {s} &= \left( \frac{\sigma_ {c}}{\sigma_ {o}} \right)^{\frac{1}{4}} \end{align}$$
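As a quick sanity check: if $\sigma_ {c} = 16$ and $\sigma_ {o} = 1$, then $\sigma_ {s} = 16^{\frac{1}{4}} = 2$, and the rescaled entries agree: $\sigma_ {s}^{-2}\sigma_ {c} = \sigma_ {s}^{2}\sigma_ {o} = 4$.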
Therefore, if we introduce symbols for the “unbalanced Gramians” $\Sigma_ {c} = T_ {u}^{-1}W_ {c}T_ {u}^{-* }$ and $\Sigma_ {o} = T_ {u}^{* }W_ {o}T_ {u}$, then $\Sigma_ {s} = \Sigma_ {c}^{\frac{1}{4}} \Sigma_ {o}^{-\frac{1}{4}}$. Through our definition $T = T_ {u} \Sigma_ {s}$, we now have an algorithm for computing the balancing transformation $T$. I promised earlier that we’d truncate $T$ to attain the $n \times r$ version; I’m actually going to coyly leave this as an exercise to the reader, as the math isn’t too hard. Remember, the diagonal entries of the balanced Gramian $\Sigma$ (the Hankel singular values) order the modes by their joint controllability and observability.
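To make the recipe concrete, here’s a NumPy/SciPy sketch of the square balancing transformation end to end. It reuses the made-up system from the Gramian snippet above, the variable names are mine, and the last line quietly spoils the truncation exercise:

```python
import numpy as np
from scipy import linalg

rng = np.random.default_rng(1)

# The same made-up stable system as in the Gramian snippet above.
n, p, q = 10, 2, 2
A = rng.standard_normal((n, n))
A -= (np.max(np.linalg.eigvals(A).real) + 1.0) * np.eye(n)
B = rng.standard_normal((n, p))
C = rng.standard_normal((q, n))

W_c = linalg.solve_continuous_lyapunov(A, -B @ B.T)
W_o = linalg.solve_continuous_lyapunov(A.T, -C.T @ C)

# Unbalanced eigenvector matrix T_u of W_c W_o; its eigenvalues are the
# diagonal of Sigma^2. Sort the modes by decreasing eigenvalue.
evals, T_u = np.linalg.eig(W_c @ W_o)
order = np.argsort(evals.real)[::-1]
T_u = T_u[:, order].real

# "Unbalanced Gramians" Sigma_c and Sigma_o (diagonal up to roundoff).
T_u_inv = np.linalg.inv(T_u)
sigma_c = np.diag(T_u_inv @ W_c @ T_u_inv.T)
sigma_o = np.diag(T_u.T @ W_o @ T_u)

# Balancing scale Sigma_s = Sigma_c^{1/4} Sigma_o^{-1/4}, then T = T_u Sigma_s.
Sigma_s = np.diag((sigma_c / sigma_o) ** 0.25)
T = T_u @ Sigma_s
T_inv = np.linalg.inv(T)

# Both balanced Gramians should now equal the same diagonal Sigma.
Sigma = T_inv @ W_c @ T_inv.T
print(np.max(np.abs(Sigma - T.T @ W_o @ T)))  # ~0 up to numerical error

# Truncation: keep the r most jointly controllable/observable modes.
r = 4
T_r, T_r_inv = T[:, :r], T_inv[:r, :]
```

Of course, for a genuinely high-dimensional system you wouldn’t do this directly; forming and eigendecomposing $W_ {c}W_ {o}$ is exactly the expensive part that the data-driven methods below are designed to avoid.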
Data-Driven Methods
I’m personally more interested in the high-level concept of balanced models for control than doing a deep dive; suffice it to say that the textbook gives fast methods for computing Gramians empirically (since the Gramians of an $n \times n$ system can be expensive to compute), as well as data-driven techniques for computing a balancing transformation, including balanced POD, the Eigensystem Realization Algorithm (ERA), and Observer Kalman Filter Identification (OKID). These last two identify the dynamics of a system from impulse-response data: ERA builds a model directly from an impulse-response experiment, and OKID estimates the impulse response when running such an experiment isn’t practical.
Conclusion
Next up we’ll start the final section of the textbook, “Advanced Data-Driven Modeling and Control,” with a discussion of MPC! I wrote my thesis on MPC, so I’m especially excited about this part. ‘Till next time!