Stochastic LQR: Part 1 – Problem
What is the best controller for a linear, time-invariant dynamical system affected by process noise and/or sensor measurement noise?
To define “best”, one must specify a performance metric, and a quadratic cost is a widely adopted choice in control and optimization. One possible reason for its popularity is the connection to minimum-energy principles in physical systems; another, of course, is that quadratic functions yield mathematically tractable solutions.
Since the system of interest today is a linear dynamical system and the cost to optimize is quadratic, the resulting controller (the regulator) is aptly named the Linear Quadratic Regulator (LQR).
Real-world systems are subject to various sources of uncertainty, including sensor measurement errors, process uncertainties, and natural disturbances.
From an analytical perspective, modeling all possible error sources is extremely challenging. But a simplified model can at least provide an initial analysis and theoretical insights that can guide us further; the mantra is that a simple model that can be analyzed is better than a complex model (likely still incomplete) that can hardly be analyzed.
Without further ado, let’s go ahead and start. Here is the concrete system that we are interested in:
\[x_{k+1} = Ax_{k} + Bu_{k} + w_{k}.\]This is a discrete-time (the index runs like \(k=0, 1, 2, \cdots\)), linear (the difference relation between \(x_{k}\) and \(x_{k+1}\) is linear in \(x_{k}\) and \(u_{k}\)), and time-invariant (\(A, B\), etc., i.e., “the model”, do not change: the \(A\) at \(k=0\) is still the same \(A\) at any other time \(k\)) system.
Other than the process noise \(w_{k} \sim \mathcal{N}(0, \Sigma_{w})\), which we model as an independent and identically distributed (i.i.d.), zero-mean Gaussian process with covariance \(\Sigma_{w}\), this system is probably as simple as it gets.
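To make the setup concrete, here is a minimal Python sketch of simulating these dynamics; the matrices \(A\), \(B\), and \(\Sigma_{w}\) below are arbitrary illustrative choices, not tied to any particular system.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative system matrices (arbitrary placeholders for this sketch).
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])      # state transition
B = np.array([[0.0],
              [0.1]])           # input matrix
Sigma_w = 0.01 * np.eye(2)      # process-noise covariance

def step(x, u):
    """One step of x_{k+1} = A x_k + B u_k + w_k, with w_k ~ N(0, Sigma_w)."""
    w = rng.multivariate_normal(np.zeros(2), Sigma_w)
    return A @ x + B @ u + w

x = np.array([1.0, 0.0])        # initial state x_0
for k in range(10):
    u = np.zeros(1)             # placeholder input; the optimal u_k comes later
    x = step(x, u)
print(x)
```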
The quadratic cost captures the penalty I incur for being in a particular state \(x_{k}\) at time \(k\) and for the effort demanded (we call this the control input, and use \(u_{k}\) to denote it) to control/drive the system towards my desired state. Here is what I am referring to:
\[J := \min_{u_{0}, \cdots, u_{N-1}} \mathbb{E}\Bigl[ \sum_{k = 0}^{N-1} \bigl( x_{k}^{\top} Q x_{k} + u_{k}^{\top} R u_{k} \bigr) + x_{N}^{\top} Q_{f} x_{N} \Bigr],\]and it is an “expected” value (hence the \(\mathbb{E}[\cdot]\)) because the process is stochastic due to the random process \(w_{k}\). The dangling term \(x_{N}^{\top} Q_{f} x_{N}\) characterizes the terminal cost, the boundary condition. As is standard, we take \(Q \succeq 0\) and \(Q_{f} \succeq 0\) (positive semidefinite) and \(R \succ 0\) (positive definite), which keeps the problem well posed.
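As a sanity check on the notation, here is a hedged sketch that Monte Carlo-estimates this expected cost for one fixed (and certainly suboptimal) zero-input policy; the matrices, weights, and horizon \(N\) are again arbitrary placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative matrices and weights (arbitrary choices for this sketch).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Sigma_w = 0.01 * np.eye(2)
Q  = np.eye(2)          # state penalty
R  = 0.1 * np.eye(1)    # input penalty
Qf = np.eye(2)          # terminal penalty
N  = 20                 # horizon length

def cost_of_rollout(x0, policy):
    """Accumulate sum_{k=0}^{N-1} (x_k^T Q x_k + u_k^T R u_k) + x_N^T Qf x_N
    along one noisy rollout under the feedback policy u_k = policy(x_k)."""
    x, J = x0, 0.0
    for k in range(N):
        u = policy(x)
        J += x @ Q @ x + u @ R @ u
        w = rng.multivariate_normal(np.zeros(2), Sigma_w)
        x = A @ x + B @ u + w
    return J + x @ Qf @ x

# Monte Carlo estimate of the expected cost for a naive zero-input policy.
x0 = np.array([1.0, 0.0])
J_hat = np.mean([cost_of_rollout(x0, lambda x: np.zeros(1)) for _ in range(1000)])
print(f"estimated expected cost: {J_hat:.3f}")
```

Note that this sketch only evaluates the objective for one candidate policy; the minimization over \(u_{0}, \cdots, u_{N-1}\) is exactly what the coming posts will solve.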
Today, I presented the problem of discrete-time stochastic LQR control.
Since my goal is to keep each post brief (I am aiming for under 5 minutes per post), I will stop here and continue in the next installment of this larger series.
Perhaps this is a little annoying for readers who already know this material, but my target audience is the inexperienced reader, so I will compromise by posting a fully connected document at the end of the series.
In the next post, I will use Bellman’s principle of optimality to derive the optimal (state-feedback) controller.
Towards the end, we will also study the Linear Quadratic Estimator (LQE), i.e., the Kalman Filter.
Along the way, we will use the principle of duality, which nicely connects the LQR and the LQE.