Kalman Filters, They’re Magic!

They’re not really magic, and I’ll probably hate you if you say they are.

The Kalman filter, and variations thereof, are probably the most commonly encountered optimal state estimators you will come across. They just work well and mathematically are pretty, well, badass. I’ve seen plenty of people use them despite not knowing their noise characteristics, or without really having a good idea of what the system processes are, and they still work decently well when tuned. That said, one of the most common causes of Kalman filter failure is modeling error.

Guts of the Kalman Filter

We’ve already talked about a few different flavors of least squares estimators, and the Kalman filter continues to build on these concepts. In the case of the recursive least squares estimator, we now only need to change the symbols we use slightly to denote a priori and a posteriori state estimates and covariances. Oh, and actually have an idea of what the hell it is our system is doing, along with its internal noise process. This then allows us to estimate what the next set of states will be for the system given our model of it, while also incorporating our actual measurements. One of the beautiful things about the Kalman filter is that the Kalman gain term then selectively weighs how much we should trust our model versus our measurements given the measurement statistics.

With the extra complication of having to know how our system evolves in time a la its state transition matrix, we can describe the discrete-time Kalman filter by first modelling our dynamic system as follows:

\begin{aligned}  1) \displaystyle \qquad x_k &= F_{k-1}x_{k-1} + G_{k-1} u_{k-1} + w_{k-1}\\  y_k &= H_k x_k + v_k \\  E \left( w_k w_j^T \right ) &= Q_k \delta_{k-j} \\  E \left( v_k v_j^T \right ) &= R_k \delta_{k-j} \\  E \left( w_k v_j^T \right ) &= 0  \end{aligned}  

Woah, woah, woah. There’s a whole other noise process now that we have to figure out what the hell to do with? That’s right! But fortunately for us the process noise covariance, Q, is often just used as a tuning parameter in the filter. As the process noise covariance is increased, the a priori covariance grows, thereby making the Kalman gain converge to a larger value, which in turn places a larger emphasis on the measurements. From this it’s easy to see we can use Q to favor or devalue the effect of our measurements if we have the desire to do so.
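To make that concrete, here’s a rough scalar sketch in Python (my own toy numbers, a random-walk system with F = H = 1, so equation 3 below collapses to three lines per step) that iterates the gain and covariance recursion to steady state; watch the gain grow as Q does:

def steady_state_gain(Q, R, P=1.0, iters=200):
    # Scalar random-walk model: F = H = 1.
    for _ in range(iters):
        P_minus = P + Q                  # a priori covariance
        K = P_minus / (P_minus + R)      # Kalman gain
        P = (1.0 - K) * P_minus          # a posteriori covariance
    return K

for Q in (0.001, 0.1, 10.0):
    print("Q =", Q, "-> steady-state gain =", round(steady_state_gain(Q, R=1.0), 3))

With R fixed at 1, the gain climbs from roughly 0.03 toward 0.9 as Q goes from 0.001 to 10, i.e. the filter trusts the measurements more and the model less.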

Moving on to initializing the Kalman filter:

\begin{aligned}  2) \displaystyle \qquad \hat{x}_0^+ &= E(x_0)\\  P_0^+ &= E \left [ (x_0 - \hat{x}_0^+)(x_0 - \hat{x}_0^+)^T \right ]  \end{aligned}  

Where the \hat{x}_0^+ refers to the initial a posteriori state estimate, and P_0^+ refers to the initial a posteriori covariance.

With this model and initialization, it’s time to just let the Kalman filter chug along with the following equations at each time step k = 1, 2, etc.

\begin{aligned}  3) \displaystyle \qquad P_k^- &= F_{k-1}P_{k-1}^+ F_{k-1}^T + Q_{k-1}\\  K_k &= P_k^- H_k^T \left (H_k P_k^- H_k^T + R_k \right ) ^{-1} \\  &= P_k^+ H_k^T R_k^{-1} \\  \hat{x}_k^- &= F_{k-1} \hat{x}_{k-1}^+ + G_{k-1} u_{k-1} \\  \hat{x}_k^+ &= \hat{x}_k^- + K_k \left( y_k - H_k \hat{x}_k^- \right ) \\  P_k^+ &= \left (I - K_kH_k \right ) P_k^- \left (I- K_k H_k \right )^T + K_k R_k K_k^T \\  &= \left [ (P_k^-)^{-1} + H_k^T R_k^{-1} H_k \right ] ^{-1}\\  &= \left (I - K_k H_k \right )P_k^- \\  \end{aligned}  

Where \hat{x}_k^- is the a priori state estimate, and \hat{x}_k^+ is the a posteriori state estimate. The first representation of the a posteriori covariance is known as the Joseph stabilized version, and is the most robust, whereas the third version is much more succinct, but is less robust. If the second expression for the Kalman gain is used, so too must the second expression for the a posteriori covariance.
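If it helps to see equations 1) through 3) in code, here’s a minimal sketch in Python/NumPy for a made-up 1-D constant-velocity tracker; the matrices, noise levels, and trajectory are placeholders I picked for illustration, not anything canonical:

import numpy as np

rng = np.random.default_rng(0)

dt = 1.0
F = np.array([[1.0, dt],
              [0.0, 1.0]])          # state transition (position, velocity)
H = np.array([[1.0, 0.0]])          # we only measure position
Q = 0.01 * np.eye(2)                # process noise covariance (tuning knob)
R = np.array([[1.0]])               # measurement noise covariance

# Initialization (equation 2): a posteriori estimate and covariance at k = 0.
x_hat = np.array([[0.0], [1.0]])
P = 10.0 * np.eye(2)

# Simulate the true system (equation 1, no input u) and filter it (equation 3).
x_true = np.array([[0.0], [1.0]])
for k in range(1, 50):
    x_true = F @ x_true + rng.multivariate_normal(np.zeros(2), Q).reshape(2, 1)
    y = H @ x_true + rng.normal(0.0, np.sqrt(R[0, 0]))

    # Time update: a priori covariance and state estimate.
    P_minus = F @ P @ F.T + Q
    x_minus = F @ x_hat

    # Kalman gain and measurement update (innovations weighted by the gain).
    S = H @ P_minus @ H.T + R
    K = P_minus @ H.T @ np.linalg.inv(S)
    x_hat = x_minus + K @ (y - H @ x_minus)

    # Joseph stabilized form of the a posteriori covariance.
    I_KH = np.eye(2) - K @ H
    P = I_KH @ P_minus @ I_KH.T + K @ R @ K.T

print("final estimate:", x_hat.ravel(), "true final state:", x_true.ravel())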

Cool Things to Take Away

In the measurement update equation above, the quantity (y_k - H_k \hat{x}_k^-) is known as the innovations, which is simply the difference between the actual measurement and the predicted measurement. This is the part of the measurement that contains new information about the state. As you can see, the Kalman filter then weights the innovations by the Kalman gain and adds the result to the a priori state estimate to finally obtain the a posteriori state estimate. Therefore, the only difference between the a priori and a posteriori state estimate is that the latter takes the new measurement into account, whereas the former is the state estimate according to what our dynamic model predicts. If you’re coming at this from a mobile robotics application, people will generally refer to the dynamic model as the motion model, and to the state estimate as the belief.

I won’t go into the proof, but the Kalman filter is the optimal filter given Gaussian noise processes. Even if the noise processes are not Gaussian, the Kalman filter is still the optimal linear filter.

I really like the little block diagram that wikipedia uses on their page for the Kalman filter; I think it does a pretty nice job of illustrating the flow of the filter.

Kalman filter block diagram, courtesy of https://en.wikipedia.org/wiki/Kalman_filter

Another way that I’ve seen the Kalman Filter depicted is with a drawing similar to this one from Mathworks:

In the above image, I would prefer it if the mean for the leftmost, blue, Gaussian was labelled \hat{x}_k^- and the mean for the center Gaussian was labelled \hat{x}_k^+, but I think you get the idea they are trying to convey.

Lastly, I feel like there isn’t much need to write your own filter from scratch, as there are a decent number of libraries out there that you can piggyback off of. If you want to give FilterPy a try, its kalman module has a KalmanFilter class that will do basically what you’d like (see the sketch below); otherwise there are libraries like PyKalman.
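Here’s a rough sketch of what that might look like with FilterPy’s KalmanFilter (pip install filterpy); the model values are the same toy constant-velocity placeholders as above, and the measurements are made up:

import numpy as np
from filterpy.kalman import KalmanFilter

kf = KalmanFilter(dim_x=2, dim_z=1)
kf.x = np.array([0.0, 1.0])            # initial state (position, velocity)
kf.F = np.array([[1.0, 1.0],
                 [0.0, 1.0]])          # state transition matrix
kf.H = np.array([[1.0, 0.0]])          # measurement matrix
kf.P *= 10.0                           # initial a posteriori covariance
kf.R = np.array([[1.0]])               # measurement noise covariance
kf.Q = 0.01 * np.eye(2)                # process noise covariance (tuning knob)

for z in [1.1, 2.0, 2.9, 4.2, 5.1]:    # placeholder position measurements
    kf.predict()                       # a priori estimate
    kf.update(z)                       # a posteriori estimate
    print(kf.x)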

Verifying Your Kalman Filter Isn’t Broken and Final Thoughts

One of the easiest ways to verify the performance of your Kalman filter is to monitor what the innovations are doing. The innovations should be a white noise process with zero mean and a covariance of

4) \displaystyle \qquad \left (H_kP_k^-H_k^T + R_k \right )  

If we see that the innovations appear to be colored, nonzero-mean, or have the wrong covariance, then there is something wrong with the filter; this is usually due to a modeling error.
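If you log the innovations and their predicted covariances as the filter runs, a sanity check might look something like this sketch (the variable names and logging scheme are assumptions on my part):

import numpy as np

def check_innovations(innovations, S_list):
    # innovations: list of length-m arrays (y_k - H_k @ x_k_minus), logged each step
    # S_list:      list of m x m arrays (H_k @ P_k_minus @ H_k.T + R_k), logged each step
    nu = np.column_stack(innovations)                  # m x N
    print("sample mean       :", nu.mean(axis=1))      # should be close to zero
    print("sample covariance :", np.atleast_2d(np.cov(nu)))
    print("predicted (avg S) :", np.mean(S_list, axis=0))
    # A quick, crude whiteness check: lag-1 autocorrelation should be near zero.
    for i in range(nu.shape[0]):
        print("lag-1 autocorr, component", i, ":",
              np.corrcoef(nu[i, :-1], nu[i, 1:])[0, 1])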

Oh, also make sure your error covariance matrix, P, is converging (kind of important).

Sometimes to tie a bunch of the concepts together people will refer to Kalman filters as Linear Quadratic Estimators, simply because we assume a linear system, and our cost minimization function is quadratic.

Until next time, happy filtering!
