Found a thing in one of my old notebooks about how to think about… - Notes from a Medium-Sized Island

[Mar. 22nd, 2016|04:28 pm]

Found a thing in one of my old notebooks about how to think about Kalman filters in a super simple way that never requires even saying the phrase "square root of pi" or the word "variance" explicitly.

Here's a very simple version of the problem to be solved:
I have fuzzy knowledge about where something (a robot, maybe) is, on the real line. This fuzzy knowledge takes the form of a Gaussian distribution. The probability density that my robot is at point x is proportional to exp(-A(x - a)^2) for some constants A and a. I said "proportional to", because I'm going to be ignoring normalizing constants throughout. Now the robot reads from a noisy sensor, which says that its position is b. I know the sensor has Gaussian noise, so the probability it says b if the position really were x is proportional to exp(-B(x - b)^2).

What's my posterior belief about where the robot is? Bayes says Pr[position = x | sensor says b] = Pr[position = x and sensor says b] / Pr[sensor says b]

I note that Pr[sensor says b] is some big ugly integral, but it's a constant not depending on x, so I'm going to ignore it and compute Pr[position = x and sensor says b].

That's just (marking with ~ every time that I'm multiplying or dividing by a constant independent of x)
exp(-A(x - a)^2) exp(-B(x - b)^2)
= exp(-(A(x - a)^2 + B(x - b)^2))
= exp(-(Ax^2 - 2xaA + Aa^2 + Bx^2 - 2xbB + Bb^2))
~ exp(-(Ax^2 - 2xaA + Bx^2 - 2xbB))
= exp(-((A+B)x^2 - 2x(Aa + Bb)))
= exp(-(A+B)(x^2 - 2x(Aa + Bb)/(A+B)))
~ exp(-(A+B)(x - (Aa + Bb)/(A+B))^2)

which is another Gaussian, whose mean is (Aa + Bb)/(A+B) and whose coefficient is A+B. The mean looks very nicely like a weighted average of the means of the two inputs, weighted by constants that are (we secretly know even though we promised not to say "variance") proportional to the inverses of the variances. (They're off by a factor of 2: A is 1/(2 * variance).)
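
Here's a quick numerical sanity check of that result, as a sketch in Python. The function names (fuse, product) and the particular numbers are made up for illustration. It checks that the product of the two bumps divided by exp(-(A+B)(x - c)^2), with c = (Aa + Bb)/(A+B), is the same constant at every x tried, which is all that "~" is claiming.

```python
import math

def fuse(A, a, B, b):
    """Combine exp(-A(x-a)^2) and exp(-B(x-b)^2) into exp(-C(x-c)^2), up to a constant."""
    C = A + B                  # the coefficients just add
    c = (A * a + B * b) / C    # weighted average of the two centers
    return C, c

def product(A, a, B, b, x):
    """Unnormalized joint density: prior bump times sensor bump."""
    return math.exp(-A * (x - a) ** 2) * math.exp(-B * (x - b) ** 2)

# Illustrative numbers only: a vague prior at 0, a sharper sensor reading at 1.
A, a = 2.0, 0.0
B, b = 6.0, 1.0
C, c = fuse(A, a, B, b)

# If the product really is proportional to exp(-C(x-c)^2),
# this ratio is one fixed constant no matter which x we pick.
ratios = [product(A, a, B, b, x) / math.exp(-C * (x - c) ** 2)
          for x in (-1.0, 0.0, 0.5, 2.0)]
print(C, c, ratios)
```

With these numbers the sensor gets three times the weight of the prior, so the fused center lands three quarters of the way from a to b.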

And this generalizes fine to the n-dimensional case: you carry around the inverse of the covariance matrix, which stands for an n x n chart of the "number of votes" (the role A and B play above).
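
A minimal sketch of the 2-dimensional version, assuming the beliefs are exp(-(x-a)^T A (x-a)) and exp(-(x-b)^T B (x-b)) with symmetric positive-definite matrices A and B. The function name fuse_2d and the numbers are hypothetical, and the 2 x 2 solve is written out by hand so the example needs no libraries. The same completing-the-square argument gives a combined matrix A + B and a mean m that solves (A + B) m = Aa + Bb.

```python
def fuse_2d(A, a, B, b):
    """Fuse two 2-D beliefs given as (vote matrix, center) pairs."""
    # The vote charts add entry by entry.
    C = [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]
    # Right-hand side: A a + B b.
    r = [sum(A[i][j] * a[j] + B[i][j] * b[j] for j in range(2))
         for i in range(2)]
    # Solve C m = r with the explicit 2 x 2 inverse.
    det = C[0][0] * C[1][1] - C[0][1] * C[1][0]
    m = [(C[1][1] * r[0] - C[0][1] * r[1]) / det,
         (-C[1][0] * r[0] + C[0][0] * r[1]) / det]
    return C, m

# Illustrative numbers: the first belief is confident about coordinate 0,
# the second is confident about coordinate 1.
A = [[4.0, 0.0], [0.0, 1.0]]
a = [0.0, 0.0]
B = [[1.0, 0.0], [0.0, 4.0]]
b = [10.0, 10.0]
C, m = fuse_2d(A, a, B, b)
print(C, m)
```

Each coordinate of the fused center sits closer to whichever belief had more votes in that direction.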

I wish someone had taught this whole business to me with inverse-variances as first-class concepts.

From: eub
2016-03-23 07:30 am (UTC)
That's cool.

Let's see, is there a more feely way to see the product of Gaussians than going into the log domain where we know that quadratics sum to a quadratic? Where you can see the position of the apex by making the slopes cancel, and see the width because you know the derivative lines just add.
From: jcreed
2016-03-23 12:31 pm (UTC)
Ooh I would like it if there were, but I'm not able to see how to tell that story at the moment.
From: eub
2016-03-25 04:36 am (UTC)
It keeps bugging me, it's really counterintuitive to me that those multiply to a Gaussian! The location of the maximum, sure, that makes sense. But it's a little hard to believe that the side arms of the product come out right as exp(-quadratic). Because look, you're multiplying an exp(-quadratic) arm by a thing with a bump over there, doesn't that create a deviation?

So the exp/log transform busts my intuition.