In short, the conjecture is: for any N, the act of doing N iterations of the Metropolis-Hastings sampling algorithm is Lipschitz-continuous (specifically, 2-Lipschitz), by which I mean continuous in the *potential function* considered as an input, with an L^∞ metric on the space of potential functions, and a log-L^∞ metric on the probability distributions that the algorithm outputs.

Quick intro to MH:

What MH is good for is sampling from a distribution that I "kind of know", but only in a funny way: I've got a "potential function" V from the integers to the reals, which determines a probability distribution over the integers. V says how unlikely various outcomes are: the bigger V(x) is, the smaller the probability of choosing integer x. Specifically, I want to draw a sample from the distribution where the probability of x is

exp(-V(x)) / sum_y exp(-V(y)) ("Boltzmann distribution")

That big sum over all integers y is there to make sure that the probabilities add up to 1. This normalizing factor, even in applications where it isn't blatantly infinite, is usually horrendously (exponentially) hard to compute.
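As a concrete sketch of the target distribution, here it is on a truncated window of integers. The truncation is my own approximation for illustration, since the true sum runs over all integers:

```python
import math

def boltzmann(V, support):
    # Boltzmann probabilities exp(-V(x)) / Z over a finite window of
    # integers. Truncating the support is an approximation (mine, for
    # illustration): reasonable when V grows fast enough that the
    # tails carry negligible weight.
    weights = {x: math.exp(-V(x)) for x in support}
    Z = sum(weights.values())  # the (in general horrendous) normalizer
    return {x: w / Z for x, w in weights.items()}

# A quadratic potential gives a discrete Gaussian-like distribution.
probs = boltzmann(lambda x: x * x / 8.0, range(-20, 21))
```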

So instead of computing this directly, we do a biased random walk, starting at zero:

1. Let p = 0

2. Flip a coin to pick d = -1 or 1. This is a "proposed step" to the left or right.

3. Set q = V(p + d) - V(p). This is the potential difference between our current state and the proposed state.

4. If q is negative or zero, set p = p + d and go to step 2.

5. If q is positive, choose a random number z from the exponential distribution. (Sampling from the exponential distribution is easy: just choose a number w uniformly in (0,1] and set z = -ln w)

6. If z > q then set p = p + d and go to step 2. The intuition here is that q is a "potential energy hill" we have to climb up, and z is a random kick of energy we get from thermal background noise. If z is bigger than the energy hill, we still get to go up.

7. Otherwise, just leave p as it is, and go to step 2.

And so you just pick a big N, and stop after N iterations and output whatever p happens to be at the end. This converges on the Boltzmann distribution above.
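The steps above can be sketched in code (a minimal Python rendering of my own, not anyone's reference implementation):

```python
import math
import random

def mh_sample(V, N, rng=random):
    # One run of the walk described above: N Metropolis-Hastings
    # iterations for potential V, starting at p = 0.
    p = 0
    for _ in range(N):
        d = rng.choice((-1, 1))   # step 2: propose a step left or right
        q = V(p + d) - V(p)       # step 3: potential difference
        if q <= 0:                # step 4: downhill or flat, accept
            p += d
        else:
            # steps 5-6: exponential "thermal kick"; 1 - rng.random()
            # lies in (0, 1], which keeps the log finite
            z = -math.log(1.0 - rng.random())
            if z > q:
                p += d
        # step 7: otherwise leave p alone and loop
    return p
```

Since P(z > q) = e^{-q} for an exponential z, this accepts an uphill move with probability exp(-(V(p+d) - V(p))), which is the standard Metropolis acceptance rule.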

Okay, so conjecture:

If you change all the values of the potential function independently by no more than c, then after N iterations of Metropolis-Hastings, the probability distribution is changed (multiplicatively) by no more than a factor of e^{2c}.

That is, suppose I have two different potential functions V and V', and I write MH(V,N,x) for the probability that, running with potential V, after N iterations p = x. Then:

Assuming that for all x I have |V(x)-V'(x)| < c, then I will have for all x and all N that

|(ln MH(V,N,x)) - (ln MH(V',N,x))| < 2c

The reason I suspect this to be true is that (a) it's true for the distribution MH converges to --- this is exactly the correctness of the "exponential mechanism" in the differential privacy literature, which is why I care about this conjecture in the first place! --- and (b) some numerical experiments in Mathematica failed to find a counterexample.
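For what it's worth, here is the sort of check I mean, as a Python sketch (not the original Mathematica; truncating the state space is my own approximation): compute the exact N-step distribution for two nearby potentials and compare the worst-case log-ratio against 2c.

```python
import math

def mh_dist(V, N, lo, hi):
    # Exact distribution of p after N MH iterations, with the state
    # space truncated to the integers [lo, hi] (my approximation:
    # proposals that would leave the window are rejected).
    dist = {x: 0.0 for x in range(lo, hi + 1)}
    dist[0] = 1.0  # the walk starts at p = 0
    for _ in range(N):
        new = {x: 0.0 for x in dist}
        for p, mass in dist.items():
            for d in (-1, 1):
                t = p + d
                if lo <= t <= hi:
                    # accept with probability min(1, e^{-q})
                    a = min(1.0, math.exp(V(p) - V(t)))
                    new[t] += 0.5 * mass * a
                    new[p] += 0.5 * mass * (1.0 - a)
                else:
                    new[p] += 0.5 * mass  # edge proposal: stay put
        dist = new
    return dist

c = 0.1
V1 = lambda x: abs(x) / 2.0
V2 = lambda x: abs(x) / 2.0 + c * math.sin(x)  # |V1 - V2| < c everywhere
d1 = mh_dist(V1, 10, -15, 15)
d2 = mh_dist(V2, 10, -15, 15)
worst = max(abs(math.log(d1[x] / d2[x]))
            for x in d1 if d1[x] > 0 and d2[x] > 0)
# the conjecture predicts worst < 2 * c
```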

But I really don't know how to even get started proving it. I look at the first few iterations, and the algebra gets horrible, and I don't see any invariants to latch onto.