[HN Gopher] How a Kalman filter works, in pictures
       ___________________________________________________________________
        
       How a Kalman filter works, in pictures
        
       Author : jack_riminton
       Score  : 266 points
       Date   : 2021-12-07 15:13 UTC (7 hours ago)
        
 (HTM) web link (www.bzarg.com)
 (TXT) w3m dump (www.bzarg.com)
        
       | jvanderbot wrote:
       | The best exercise I had in our sensing & estimation class was to
       | derive the Kalman filter from 'scratch'. It's actually just the
       | series of steps used to perform a minimization of a convex cost
        | function, where the cost function is the inverse likelihood of
        | the state given the measurements.
       | 
       | Someone once joked that all of sensing and estimation (and path
       | planning and optimal control) is just applied convex
       | minimization.
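        | 
        | For anyone curious, here's a minimal numpy sketch of that idea
        | (all values illustrative): minimize the quadratic cost directly
        | and check it matches the usual gain-form update.
        | 
        |     import numpy as np
        | 
        |     # Toy 2-state prior, one scalar measurement
        |     P = np.array([[2.0, 0.5], [0.5, 1.0]])  # prior covariance
        |     H = np.array([[1.0, 0.0]])              # measurement model
        |     R = np.array([[0.25]])                  # measurement noise
        |     x_prior = np.array([1.0, 0.0])
        |     z = np.array([1.3])
        | 
        |     # MAP estimate: set the gradient of
        |     #   (x-x_prior)' P^-1 (x-x_prior) + (z-Hx)' R^-1 (z-Hx)
        |     # to zero and solve the normal equations.
        |     A = np.linalg.inv(P) + H.T @ np.linalg.inv(R) @ H
        |     b = np.linalg.inv(P) @ x_prior + H.T @ np.linalg.inv(R) @ z
        |     x_map = np.linalg.solve(A, b)
        | 
        |     # Standard Kalman-gain form of the same update
        |     K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
        |     x_kf = x_prior + K @ (z - H @ x_prior)
        | 
        |     assert np.allclose(x_map, x_kf)  # same answer, two routes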
        
         | NougatRillettes wrote:
         | Nice. I wrote up another possible derivation here
         | https://ngr.yt/blog/kalman/ in case anyone is interested.
        
       | rland wrote:
       | This article is fantastic.
       | 
       | For those who have some familiarity with Python, I found this to
       | be a great resource for Kalman Filtering:
       | https://github.com/rlabbe/Kalman-and-Bayesian-Filters-in-Pyt...
        
       | albinofrenchy wrote:
       | I've been considering writing up something for extended kalman
       | filters like this. Kalman filters are great but in my experience
       | extremely limited without the non-linear extensions which make
       | them useful in a much wider variety of problems.
        
         | marton78 wrote:
          | You should have a look at sigma-point Kalman filters.
        
       | 6gvONxR4sf7o wrote:
       | Where KFs finally clicked for me was in deriving the recursive
       | form of good old fashioned linear regression. You can do linear
       | regression a data point at a time, or a batch at a time, and
       | while I can't remember if it works out to _exactly_ the same
       | thing (because of the recursive state), it works out to
        | essentially the same. Dynamic Bayesian networks are a
        | superclass of KFs and they're a great thing to have in your
        | toolkit if you work with data.
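        | 
        | For the curious, a tiny numpy sketch of that one-point-at-a-time
        | regression (names illustrative); with a vague prior it lands
        | essentially on the batch answer:
        | 
        |     import numpy as np
        | 
        |     rng = np.random.default_rng(0)
        |     X = rng.normal(size=(200, 2))
        |     y = X @ np.array([2.0, -1.0]) + 0.1 * rng.normal(size=200)
        | 
        |     # Recursive least squares, one data point at a time
        |     theta = np.zeros(2)
        |     P = 1e6 * np.eye(2)   # vague prior on the parameters
        |     for xi, yi in zip(X, y):
        |         k = P @ xi / (1.0 + xi @ P @ xi)   # "Kalman gain"
        |         theta = theta + k * (yi - xi @ theta)
        |         P = P - np.outer(k, xi @ P)
        | 
        |     # Batch least squares for comparison
        |     theta_batch, *_ = np.linalg.lstsq(X, y, rcond=None)
        |     assert np.allclose(theta, theta_batch, atol=1e-3)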
        
       | chmaynard wrote:
       | (2015) https://news.ycombinator.com/from?site=bzarg.com
        
       | east2west wrote:
       | The most thorough explanation of Kalman filter as recursive least
       | square coupled with linear state-space models is "Linear
       | Estimation" by Thomas Kailath, Ali H. Sayed, Babak Hassibi. It
       | covers robust numerical algorithms such as square-root Kalman
       | filter. Unfortunately, it is a big book and reading it is going
       | to be a big time investment, but I believe it will be a
       | worthwhile investment. It develops in depth various linear
       | algebra concepts and demonstrates profound application to linear
       | least square. This book is what people mean when they say learn
       | from masters.
       | 
       | It is also the only technical book I have read that aptly quotes
        | Shakespeare: "Age cannot wither her, nor custom stale her
       | infinite variety."
        
       | michelpp wrote:
       | Michel van Biezen has an excellent video series on Kalman
       | filters:
       | 
       | https://www.youtube.com/watch?v=CaCcOwJPytQ
       | 
        | He walks you through all the way from the rationale, to a simple
       | algebraic example, to the full blown Linear Algebraic Matrix
       | solution. It's really worth watching even just the first couple
       | videos to get a deeper understanding of this very useful
       | technique.
        
         | moffkalast wrote:
         | > The Kalman Filter (video 1 of 55)
         | 
         | That seems... excessive.
        
           | persedes wrote:
           | it's not 55 videos (dunno if he forgot to upload them or
           | misnumbered them...). Also most of it (I'd say 70%) is him
           | doing matrix multiplications on a whiteboard. Which is
           | excellent if you need a refresher in linear algebra (or just
           | suck at it ;)), otherwise you can skip a lot of the content.
           | That aside, he does an excellent job explaining it.
        
       | quda wrote:
        | The article starts nicely but soon ends up in the mud of math
        | mysticism: K' = P_k H_k^T (H_k P_k H_k^T + R_k)^-1 ... Useless.
        
       | rightbyte wrote:
       | "I have to tell you about the Kalman filter, because what it does
       | is pretty damn amazing."
       | 
       | Not really, no. Kalman filters are the optimal way of removing
       | white noise from simulations you yourself are adding.
       | 
       | Good luck getting the covariance matrix right in practice.
       | Usually it is diagonal and essentially a simple low pass filter
       | with a fancy name.
        
         | dhdc wrote:
          | The actual power of the Kalman filter lies in its ability to
         | estimate states/variables that can't be measured directly, not
         | just filtering time-series data.
         | 
          | With a Kalman filter, you not only get your output filtered,
          | you get the state of the system at each step as well. No
          | amount of low pass filtering can achieve that.
        
           | rightbyte wrote:
           | Ye sure, but you can have a state space representation
           | without doing a Kalman filter.
        
             | wenc wrote:
             | But you cannot estimate values of missing states
             | (unmeasured) with just a state space representation even
             | under conditions of observability. You need a state
              | estimator for that, and the Kalman filter is such an
              | estimator.
             | 
             | Also, while it's true the covariance matrix is initialized
             | with diagonals (we often don't have good priors), the
             | values are being dynamically re-estimated from measurements
             | at every iteration. The initial parameters are just that --
             | initial.
             | 
              | If certain assumptions hold, i.e. Gaussian noise and an
              | approximately correct LTI model, the covariance matrix
             | converges to a stable covariance matrix that reflects the
             | current reality. Some of those assumptions are relaxed with
             | EKFs and UKFs but they're essentially built on a Kalman
             | framework (the competing one being MHE).
             | 
             | The Kalman filter is a state estimator, not just a filter.
             | And it is used in industrial applications with advanced
             | control systems (MPC).
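              | 
              | To make this concrete, a minimal sketch (numbers
              | illustrative): measure only position, and the
              | filter recovers the unmeasured velocity.
              | 
              |     import numpy as np
              | 
              |     dt = 0.1
              |     F = np.array([[1, dt], [0, 1]])  # const-velocity
              |     H = np.array([[1.0, 0.0]])       # position only
              |     Q = 1e-4 * np.eye(2)
              |     R = np.array([[0.5]])
              | 
              |     rng = np.random.default_rng(1)
              |     x_true = np.array([0.0, 1.0])    # true vel = 1.0
              |     x, P = np.zeros(2), np.eye(2)
              |     for _ in range(200):
              |         x_true = F @ x_true
              |         z = H @ x_true + rng.normal(0, 0.7, 1)
              |         x, P = F @ x, F @ P @ F.T + Q        # predict
              |         K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
              |         x = x + K @ (z - H @ x)              # update
              |         P = (np.eye(2) - K @ H) @ P
              | 
              |     print(x[1])  # velocity estimate, near 1.0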
        
               | rightbyte wrote:
               | > Also, while it's true the covariance matrix is
               | initialized with diagonals (we often don't have good
               | priors), the values are being dynamically re-estimated
               | from measurements at every iteration. The initial
               | parameters are just that -- initial.
               | 
               | The P covariance matrix does not depend on online input
                | though, right? So the initial values of the covariance
                | matrices Q and R, together with the models H and F,
                | decide where P will end up. I mean, there is no online
                | parameter estimation
               | (dependent on input). Or am I getting it wrong?
               | 
                | I sure agree that Kalman filters can be useful. But I
               | don't like how they are presented.
               | 
               | Like the author of the blog writes: "At times its ability
               | to extract accurate information seems almost magical"
        
               | wenc wrote:
               | You're correct. The Kalman gain in the linear case can be
               | computed offline since it's only a function of model
               | parameters. The P matrix is updated at every iteration,
                | but it is also only a function of Q and R (which are
                | determined offline) and the previous P, which at time 0
               | is initialized with diagonals. P does represent the
               | covariance of (xpred - xact) but it doesn't take in
               | inputs at every iteration. I appreciate that call out.
               | 
               | Like most things, in practical implementations there's a
               | bunch of extra things that go beyond the basic theory you
               | have to do like reestimating the initial P every few
               | iterations as well as reestimating Q and R with online
               | data.
               | 
               | It's not magic and yes it is a form of recursive linear
               | regression but it does work in real life.
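                | 
                | E.g., a quick sketch of that offline recursion
                | (illustrative values); note that no measurement
                | ever enters the loop:
                | 
                |     import numpy as np
                | 
                |     dt = 0.1
                |     F = np.array([[1, dt], [0, 1]])
                |     H = np.array([[1.0, 0.0]])
                |     Q = 1e-4 * np.eye(2)
                |     R = np.array([[0.5]])
                | 
                |     P = np.eye(2)
                |     for _ in range(500):  # data-free loop
                |         P = F @ P @ F.T + Q
                |         S = H @ P @ H.T + R
                |         K = P @ H.T @ np.linalg.inv(S)
                |         P = (np.eye(2) - K @ H) @ P
                | 
                |     print(K)  # steady-state gain, all offline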
        
               | akumahh wrote:
                | Yes, the P covariance matrix does depend on online input;
                | anyone who has ever used one would know, so perhaps you
                | have no idea what you are talking about.
                | 
                | If instead of making such posts you would like to learn
                | more: the P matrix is changed both through the model and
                | through the Kalman gain, and in both of these steps it
                | depends on online input (the model can depend on the
                | state, and the Kalman gain depends on sensor
                | measurements).
        
             | dhdc wrote:
              | Okay, but it's not very useful, is it? Sure, you can
              | propagate the state vectors directly, but even a tiny
              | amount of noise may be amplified and can make the results
              | useless. Also, read up on why the Euler method is bad for
              | solving ODEs numerically.
             | 
             | This is exactly why we use Kalman filters.
        
           | pvarangot wrote:
           | > No amount of low pass filtering can achieve that.
           | 
           | Can't a simple IIR "low pass" do that for certain systems
           | once they reach a steady state? I don't remember exactly when
           | the KF becomes an IIR but intuitively for certain systems I
           | kinda think the constants would stabilize after a while and
           | then if you just use those you get to optimality if the
           | system stabilizes.
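            | 
            | A toy scalar check of that intuition (values made up):
            | the gain settles to a constant, after which the update
            | x += k*(z - x) is exactly a one-pole IIR smoother:
            | 
            |     q, r = 1e-3, 0.5   # process/measurement noise
            |     p = 1.0
            |     for _ in range(1000):   # gain recursion only
            |         p = p + q
            |         k = p / (p + r)
            |         p = (1 - k) * p
            | 
            |     print(k)  # the steady-state "alpha"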
        
           | edge17 wrote:
           | Going down the rabbit hole indeed... Kalman filters are
           | Linear Quadratic Estimators (LQE) in control theory. Very
           | rich field with very powerful techniques.
        
         | ska wrote:
         | > Good luck getting the covariance matrix right in practice.
         | Usually it is diagonal and essentially a simple low pass filter
         | with a fancy name.
         | 
         | I think here you are confusing initialization of an estimator
         | with the estimation process.
        
           | rightbyte wrote:
           | I was referring to Q and R by "diagonal covariance matrix".
           | Forgot P was also a covariance matrix.
           | 
           | But ye, control theory was over my head in university so ...
           | I might be more frustrated with how simple and nice PI-
           | controllers and first order IIR filters are used in industry
           | while we were butting our heads bloody against the wall in
           | university.
        
             | ska wrote:
             | Industry uses a lot of things. Including Kalman filters,
             | they are actually pretty common.
             | 
             | It's true that a lot of industry problems can be solved by
             | linear regression, simple filters, Gaussian distributions,
             | etc. Which can make you wonder why you bothered about all
             | that stuff in your degree - but it's also true that there
             | are a lot of interesting problems in industry that just
             | aren't amenable to the most basic approaches.
             | 
             | It's also true that for the most part, bashing your head on
             | this stuff in university improved your ability to think
             | about the problem spaces.
        
         | minihat wrote:
         | Dear HN: Please don't let this comment discourage you from
         | reading the article.
         | 
         | Kalman filters are super useful tools across many domains.
         | 
         | They are a primary instrument in weather forecasting, robotics,
         | finance, and more. Data you see every day have been touched by
         | this technology.
        
           | vlovich123 wrote:
           | To add onto this, it's used in multiple places in GPS and
           | other positioning problems (as in I was on the CoreLocation
           | team at Apple years ago and Kalman filters were common). I'm
           | not really sure where the commenter is sourcing their claim
           | but my experience directly contradicts it.
        
           | rightbyte wrote:
           | The article is very good and pedagogical.
           | 
           | My take on Kalman filter is that they are, with a diagonal
           | regression matrix and precomputed parameters, just a
            | convoluted notation for guesswork. It is a bit like drawing
            | Nyquist diagrams for system stability analysis - mostly an
            | academic exercise. And stuff like that plagues control
           | theory. I would rather that students learned to keep it
           | simple.
        
             | WastingMyTime89 wrote:
             | Kalman filters are literally everywhere in the industry. If
             | there is a radar or data fusion involved, you can be pretty
             | sure there are Kalman filters. I know a researcher whose
             | most quoted article is just him applying fancy new methods
             | to actual industrial datasets and showing they perform
             | worse than a Kalman filter.
             | 
             | What you wrote is akin to someone explaining to students
             | doing signal processing that they should stay away from
             | Fourier transforms.
        
             | glial wrote:
             | > just a convoluted notation for guesswork.
             | 
             | You could say that for all of estimation, by definition.
             | But some estimates are better than others, and the KF is
             | the best estimate under certain conditions...
             | 
             | ...and one of those conditions is that you have a good
             | estimate of the dynamics and measurement noise parameters.
             | Rather than throw our hands up, we should just articulate
             | this, and proceed to discuss methods for getting a good
             | estimate of noise parameters, and discuss what happens if
             | our estimates are wrong.
        
           | ianai wrote:
           | Thanks for your contextualization. The above comment really
           | did hurt my desire to learn more about Kalman filters. I know
           | just being negative or contrary to a thing has an
           | unreasonably high return on investment in making a commenter
           | look smart or authoritative, but it sure does harm.
        
         | LeftHandPlane wrote:
         | Low pass filters can have significant phase delay, whereas
         | Kalman filters can be made to have essentially no lag (phase
         | delay). Additionally, Kalman filters can stabilize the estimate
         | of a state based off of other correlated measurements much
         | better than simply low pass filtering the measurement.
         | 
         | For example, in an IMU, the accelerometers may be noisy due to
         | vibration from the aircraft or vehicle, and the magnetometer
         | measurements which are not affected by vibration, can be used
         | to stabilize the inclination estimates.
         | 
         | Kalman filters are extremely useful and enable applications not
         | possible with just low pass filtering.
         | 
          | Also, you don't need to get the covariance matrix "exactly
          | right"; these are tuned in practice on actual measurements
          | and can be used to speed up or slow down the state estimation
          | or the weighting of different measurements.
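          | 
          | As a toy illustration of that weighting (numbers made up):
          | fuse two measurements of the same state with very different
          | noise, and the gain automatically leans on the clean sensor:
          | 
          |     import numpy as np
          | 
          |     H = np.array([[1.0], [1.0]])  # two sensors, one state
          |     R = np.diag([4.0, 0.1])       # noisy accel, clean mag
          |     P = np.array([[1.0]])         # prior variance
          | 
          |     K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
          |     print(K)  # ~[[0.02, 0.89]]: trusts the clean sensor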
        
       | version_five wrote:
       | I expect people to disagree with me, but it's very hard to get
       | past the condescending language in the article:
       | 
       | > Totally neat, crazy correlations, scary math, pretty pictures,
       | shiny pictures
       | 
       | It's really hard to read past this. I've seen this tone
        | occasionally from junior academic lecturers who somehow think that
       | talking to people like they are nervous children will help put
       | them at ease. I don't think the author's doing this malociouly,
       | but it comes across as "I'm so smart, but don't be afraid, I'm
       | dumbing it down to a cutesy level you can understand"
       | 
        | Avoid writing like this if you can; it's clear from the comments
        | that the article is otherwise very good.
        
         | yesenadam wrote:
         | Of course I disagree. Your comment, which advises people not to
          | write like the article, starts with a phrase it's a waste of time
         | to read. Then it misquotes the article--I was, naturally,
         | expecting to see that quote in the article. It repeats a
         | phrase, apparently for effect. "Malociouly"? You seem to see
         | signs of humanity in the writer only as condescension and
         | vanity. Your "Avoid writing like this" applies infinitely
         | better to your own comment, I think.
         | 
         | You seem to assume all writing should be dry academic writing,
         | without vivid language or friendliness to the reader.
        
         | bigdict wrote:
         | I agree. From personal experience, this sort of phrasing comes
         | during a temporary high that you get after mastering a topic.
         | Conquering a piece of technical material elevates you, and in a
         | sense you want to be condescending towards your ignorant past
         | self.
         | 
         | It's bad taste.
        
         | beaconstudios wrote:
         | it's just the pop-sci writing style. You see this with lots of
         | articles and content directed at a general audience - I think
         | this author's just reused the same style to talk to a technical
         | audience. It's not inherently condescending, it just comes
         | across that way if you're writing for an audience of technical
         | peers.
         | 
         | this kind of breathless style makes more sense when you don't
         | expect the audience to be technically literate and the content
         | is there to make them say "wow, I don't understand anything
         | you're saying but there certainly are technical words in here!"
         | - see, for instance, popular media around anything related to
         | quantum mechanics.
        
         | dang wrote:
         | " _Please don 't pick the most provocative thing in an article
         | and rush to the thread to complain about it. Find something
         | interesting to comment about instead._"
         | 
         | https://news.ycombinator.com/newsguidelines.html
        
       | maCDzP wrote:
       | I think of it as an algorithm for Bayes theorem - is that right?
        
         | iamcreasy wrote:
         | Bayes filter - yes.
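          | 
          | Concretely, the measurement update is Bayes' rule applied
          | to Gaussians. A scalar sketch (values illustrative):
          | 
          |     mu, var = 0.0, 2.0   # prior N(mu, var)
          |     z, r = 1.0, 0.5      # measurement N(z, r)
          | 
          |     k = var / (var + r)  # this is the Kalman gain
          |     mu_post = mu + k * (z - mu)
          |     var_post = (1 - k) * var
          | 
          |     print(mu_post, var_post)  # 0.8, 0.4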
        
       | gyre007 wrote:
        | A couple of years ago I blogged about how it was developed for the
       | Apollo program [1] and then I wrote an implementation of
       | different variations of it in Go [2].
       | 
       | [1] https://cybernetist.com/2019/01/13/apollo-kalman-filter-
       | and-...
       | 
       | [2] https://github.com/milosgajdos/go-estimate
        
       | leeoniya wrote:
       | somewhat related fft/data smoothing:
       | 
       | https://dawn.cs.stanford.edu/2017/08/07/asap/
       | 
       | made a recent demo vs a moving avg:
       | https://leeoniya.github.io/uPlot/demos/data-smoothing.html
       | 
        | moving avg takes a bunch of samples to converge, while ASAP does
        | it much faster, which seems to be a key property of these
        | Kalman filters as well?
        
       | jiggawatts wrote:
       | A related and much more powerful predictor in a noisy environment
       | is the Particle Filter, which uses a mechanism that is
       | particularly appealing to me as a programmer...
        
         | tbabb wrote:
         | Author here. The particle filter has its own strengths and
         | drawbacks. It makes more sense to use a particle filter in
         | situations where the state search space is highly nonlocal
         | and/or nonlinear, for example locating a drone on a map by
         | matching radar features to topography.
         | 
         | If the process is linear and the estimation error is Gaussian
         | (or approximately so in practice), the Kalman filter is known
         | to be the optimal algorithm, and the particle filter would not
         | only perform worse, but be more expensive to implement.
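          | 
          | (For the curious, a bare-bones bootstrap particle filter
          | fits in a few lines -- a sketch for a made-up scalar
          | nonlinear system:)
          | 
          |     import numpy as np
          | 
          |     rng = np.random.default_rng(2)
          |     N = 1000
          |     parts = rng.normal(0, 1, N)   # particle cloud
          |     w = np.ones(N) / N
          | 
          |     def step(parts, w, z):
          |         # predict through the nonlinear model
          |         parts = np.sin(parts) + rng.normal(0, 0.1, N)
          |         # weight by likelihood p(z | particle)
          |         w = w * np.exp(-0.5 * (z - parts) ** 2 / 0.25)
          |         w = w / w.sum()
          |         # resample if effective sample size collapses
          |         if 1.0 / (w ** 2).sum() < N / 2:
          |             idx = rng.choice(N, N, p=w)
          |             parts, w = parts[idx], np.ones(N) / N
          |         return parts, w
          | 
          |     parts, w = step(parts, w, z=0.3)
          |     print(np.sum(w * parts))  # posterior mean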
        
       | artemisyna wrote:
       | I appreciate how this post gets resubmitted every couple of
       | years. It's been a go-to bookmark for ages now and really does
       | help give some visual intuition for what's happening.
        
       | mrguyorama wrote:
       | Unfortunately for me, the intuitive part of Kalman filters is
       | easy, but I don't have nearly the kind of grasp of Linear algebra
       | I would need to implement one (I think).
       | 
       | Are there drop in, batteries included, ready to go kalman filter
       | implementations/frameworks for common microcontrollers like
       | arduino and raspberry pi 2040? Or is it infeasible to implement
       | them in limited setups?
        
         | east2west wrote:
          | I believe ROS (Robot Operating System) has good
          | implementations of state estimation algorithms. If you are
          | worried about memory footprint, then Durbin and Koopman ("Time
          | Series Analysis by State Space Methods") has a scalar version
          | of the square-root Kalman filter (it ingests one number at a
          | time rather than a whole vector at a time). You may have to
          | implement it
         | yourself though.
        
         | sjburt wrote:
         | I think you still need a good grasp of linear algebra and
         | difference equations to identify state variables and correctly
         | set up the "model" or "plant" matrix, this is specific to the
         | system so it can't be provided by the framework. If you can do
         | this, the rest of the Kalman filter is straightforward and can
         | easily be done in numpy etc.
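          | 
          | For instance, once F, H, Q, R and the state are set up, one
          | filter step is just (a sketch, not a drop-in library):
          | 
          |     import numpy as np
          | 
          |     def kf_step(x, P, z, F, H, Q, R):
          |         x = F @ x                 # predict state
          |         P = F @ P @ F.T + Q       # predict covariance
          |         S = H @ P @ H.T + R       # innovation covariance
          |         K = P @ H.T @ np.linalg.inv(S)  # Kalman gain
          |         x = x + K @ (z - H @ x)   # correct with data
          |         P = (np.eye(len(x)) - K @ H) @ P
          |         return x, P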
        
         | dbcurtis wrote:
          | Totally feasible on a microcontroller for small state sizes and
         | update rates, yes. I have worked with robots that had them
         | implemented in slow 8-bit microcontrollers. (I did not do the
         | KF, though).
        
         | dr-ando wrote:
         | The computational requirements are very modest. The magic is in
         | the math. I'm not sure if it counts as "batteries included" but
         | I wrote a Kalman filter implementation in "no-std" (no standard
         | library) rust called adskalman [1]. This means it can run on
         | very modest bare metal targets with no operating system. Of
         | course, it can also run on targets where the standard library
         | is available and the examples [2] make use of this to do nice
         | things like print the results which can be piped to a file and
         | plotted. The core runs fine on embedded targets, and we use
         | this with arm microcontrollers, but it should work on just
         | about anything. Feedback is welcome.
         | 
         | [1] https://crates.io/crates/adskalman [2]
         | https://github.com/strawlab/adskalman-rs/blob/main/examples/...
        
         | jacksonkmarley wrote:
         | > Unfortunately for me, the intuitive part of Kalman filters is
         | easy, but I don't have nearly the kind of grasp of Linear
         | algebra I would need to implement one (I think).
         | 
         | This is me, anyone got a good linear algebra resource to
         | recommend? Ideally one that addresses 'I took linear algebra
         | ages ago in uni but it didn't really stick'.
        
           | east2west wrote:
           | I really like "Linear Algebra and Its Applications" by
            | Gilbert Strang. It does not read like a typical textbook:
            | the style is conversational, and he really tries to help you
            | learn. It is one of the rare math books that includes reason,
            | context, application, and history without sacrificing rigor.
            | The book also focuses on numerical algorithms more than some
            | popular textbooks, which may be helpful for understanding
            | the Kalman filter.
        
             | jacksonkmarley wrote:
             | Thanks for the suggestion!
        
       | mgaunard wrote:
       | The way I've always seen a Kalman filter is as a recursive least
       | squares fitter, but instead of being run on the same input until
       | convergence, it is run one step on every new sample of ever-
       | changing inputs, in effect leading to a smoothing fitter.
       | 
       | For some reason I rarely see it being described as such.
        
         | nextaccountic wrote:
         | What's the "recursive" part? It's like the difference between
         | IIR and FIR?
        
           | gugagore wrote:
           | That is an astute connection. The "recursive" part is that
           | you do not need to keep all the history of observations and
            | actions (often denoted y and u in control theory). You can
            | summarize the past with sufficient statistics (the mean and
            | the variance for linear-quadratic-Gaussian (LQG) problems).
           | 
           | You could imagine keeping all of the data and fitting a model
           | to that. For LQG problems, you can use dynamic programming
           | [1] to solve the problem faster.
           | 
           | You could alternatively imagine keeping a finite window of
           | data and fitting a model to that. That filter would have a
           | finite impulse response.
           | 
           | [1] https://en.wikipedia.org/wiki/Dynamic_programming btw
           | that term kind of means two things, and this is a situation
           | in which the two meanings overlap.
        
         | ephimetheus wrote:
         | We also use it for fitting charged particle tracks in high
         | energy physics experiments.
         | 
         | Here, we're iteratively incorporating measurements along the
         | trajectory, finally arriving at basically a least squares fit
         | result, after a backward smoothing pass.
        
       | sklargh wrote:
       | When I started working with aviation people early in my career,
       | learning that radar only has a fairly good guess of where you are
       | was pretty amazing.
        
       | chillingeffect wrote:
       | down atm. https://archive.md/cLjvZ
        
       | xaedes wrote:
        | Welford's online algorithm for computing a weighted mean is
        | essentially just Kalman filtering: the weights are the inverse
        | of the measurement covariance (i.e. the information), the
        | initial state is the first measurement, and the state transition
        | and observation models are 1x1 identity matrices.
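        | 
        | A sketch of that correspondence (values illustrative):
        | 
        |     import numpy as np
        | 
        |     z = np.array([1.0, 2.0, 0.5, 1.5])   # measurements
        |     r = np.array([0.5, 1.0, 0.25, 2.0])  # their variances
        | 
        |     # Scalar Kalman filter with F = H = 1, Q = 0,
        |     # initialized from the first measurement:
        |     x, p = z[0], r[0]
        |     for zi, ri in zip(z[1:], r[1:]):
        |         k = p / (p + ri)
        |         x = x + k * (zi - x)
        |         p = (1 - k) * p
        | 
        |     # Inverse-variance weighted mean, computed directly
        |     assert np.isclose(x, np.average(z, weights=1 / r))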
        
       ___________________________________________________________________
       (page generated 2021-12-07 23:00 UTC)