Here's a cute little decision-theoretic puzzle that I fished out of a… - Notes from a Medium-Sized Island
Jason

[Apr. 19th, 2010|12:03 pm]
 [ Tags | decisions, math ]

Here's a cute little decision-theoretic puzzle that I fished out of a book:

Suppose you have a starting blob of money, say X dollars. Each day, you choose how to divide your money between consumption and investment. If you consume Y dollars, then you get ln(Y) utility out of that act of consumption. If you invest Z dollars (really, Z = X - Y; any money you don't spend automatically gets "invested"), then you get back R*Z dollars on the next day, where R is a number given to you by the universe, indicating the return you can manage to get on investments. Finally, we make the assumption that you're impatient: the actual value to you today of a util earned far in the future, say on day T, is really only D^T, where D is some "discounting factor", a number less than (but perhaps quite close to) one.

Given that, what strategy optimizes the sum of the present values of all future rewards?

So the book said it was to always consume (1-D)X and to invest DX, which surprised me, because it was totally independent of R!

Suppose the optimal strategy is at least of the above form: always consume some fixed fraction k of my current money. The answer is ostensibly that k = 1 - D.

Let's abstract away from my logarithmic utility function for a moment and say that spending Y dollars gives me f(Y) utils. Then my total utility function is

f(kX) + D f(kXR(1-k)) + D^2 f(kX(R(1-k))^2) + ... + D^n f(kX(R(1-k))^n) + ...

because immediately I consume kX, but I keep (1-k)X around, which grows to XR(1-k). Then I consume k of that, giving me f(kXR(1-k)) utility, but that all happens tomorrow, so it only counts as D f(kXR(1-k)) to me today. On the next day, the XR(1-k)^2 that I still have invested becomes XR^2(1-k)^2, and I consume k of it, giving me f(kX(R(1-k))^2) utils, but that's two days into the future, so it really only counts as D^2 f(kX(R(1-k))^2). And so on.
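For anyone who wants to poke at this numerically, here is a minimal sketch of that sum in Python. The function name `discounted_utility` and the `horizon` cutoff are my own assumptions for illustration: the sum is truncated after a finite number of days, which should be harmless when D is comfortably below one. With k very close to 1 and a long horizon, the wealth can underflow to zero in floating point, so this is only meant for moderate values.

```python
import math

def discounted_utility(k, X, R, D, horizon=200, f=math.log):
    """Truncated version of sum_n D^n f(k X (R(1-k))^n).

    k: fixed fraction of current wealth consumed each day
    X: starting money, R: daily return, D: discount factor
    horizon: number of days to sum over (an assumed cutoff)
    f: per-day utility of consumption (ln by default)
    """
    total = 0.0
    wealth = X
    for n in range(horizon):
        total += D**n * f(k * wealth)    # consume k of what I have today
        wealth = R * (1 - k) * wealth    # the rest is invested overnight
    return total

print(discounted_utility(k=0.1, X=100.0, R=1.05, D=0.9))
```

Passing a different `f` lets you try utility functions other than the logarithm.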

To optimize, I take the k-derivative and set it to zero. Let me just rewrite the above compactly as
sum_n D^n f(kX(R(1-k))^n)
and then what I want to solve for is
0 = sum_n D^n R^n X f'(kX(R(1-k))^n) ((1-k)^n - nk(1-k)^(n-1))
Now if f is ln, then the derivative is just the reciprocal, and things cancel out a lot:
0 = sum_n D^n (1 - k - nk) / (k(1-k))
and we can multiply both sides by k(1-k) to get
0 = sum_n D^n (1 - k - nk)
Now I'm going to use the fact that sum_n D^n = 1/(1-D) and sum_n n D^n = D/(1-D)^2 to get
0 = (1 - k)/(1 - D) - kD/(1-D)^2
hence, multiplying through by (1-D)^2,
0 = (1 - k)(1-D) - kD
and so
k = 1 - D
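
As a sanity check on the algebra, here is a small Python sketch that compares partial sums against the two series identities and evaluates the first-order condition sum_n D^n (1 - k - nk) at k = 1 - D. The helper name `first_order_sum`, the cutoff of 2000 terms, and the choice D = 0.9 are assumptions made purely for illustration.

```python
def first_order_sum(k, D, terms=2000):
    """Partial sum of D^n * (1 - k - n*k), the expression set to zero."""
    return sum(D**n * (1 - k - n * k) for n in range(terms))

D = 0.9
geometric = sum(D**n for n in range(2000))      # compare with 1/(1-D)
weighted = sum(n * D**n for n in range(2000))   # compare with D/(1-D)^2

print(geometric, 1 / (1 - D))
print(weighted, D / (1 - D) ** 2)
print(first_order_sum(1 - D, D))   # should be very close to zero
print(first_order_sum(0.2, D))     # clearly nonzero away from k = 1 - D
```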

There you have it! I guess both the logarithmic utility and the exponential discounting of utility over time are pretty sketchy assumptions, but it's interesting it works out this way. It's still completely weird to me that it doesn't depend on R.
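
To see the independence from R concretely, here is a rough brute-force sketch: it scans a grid of constant fractions k and reports the best one for several values of R. The names `log_utility_total` and `best_fraction`, the grid spacing, and the finite horizon are all assumptions of mine, and the log is expanded term by term so that tiny consumption values never underflow.

```python
import math

def log_utility_total(k, X, R, D, horizon=500):
    """Truncated sum_n D^n * ln(k X (R(1-k))^n), using
    ln(k X (R(1-k))^n) = ln k + ln X + n (ln R + ln(1-k))."""
    base = math.log(k) + math.log(X)
    growth = math.log(R) + math.log(1 - k)
    return sum(D**n * (base + n * growth) for n in range(horizon))

def best_fraction(X, R, D, steps=999):
    """Grid search over k = 0.001, 0.002, ..., 0.999."""
    grid = [i / (steps + 1) for i in range(1, steps + 1)]
    return max(grid, key=lambda k: log_utility_total(k, X, R, D))

D = 0.9
for R in (0.5, 1.0, 1.05, 2.0):
    print(R, best_fraction(X=100.0, R=R, D=D))   # compare with 1 - D
```

A grid search is crude, but it makes no use of the derivative, so it serves as an independent check on the calculation above.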

 From: 2010-04-19 06:11 pm (UTC)
I was worried that it would end up being trivial - if DR<1 then the optimal strategy would be to consume it all at once, while if DR>1, then there is no optimal strategy but smaller positive values of k give better outcomes. I wonder if the fact that the utility function is logarithmic is relevant here.
 From: 2010-04-19 06:13 pm (UTC)
Actually, if we take the log of everything, then we get a linear utility function, and we see that discount is just subtraction and the interest is constant addition. I still don't see why it ends up actually depending on D but not R.
 From: 2010-04-19 06:15 pm (UTC)
I thought that at first too, but the discounting factor D itself still hides outside the log.
 From: 2010-04-20 06:46 am (UTC)
Hm, I think this problem was deliberately designed to give a simple unintuitive result, since the logarithmic utility is exactly what you need to make the R, X, and k (here k is your strategy vector, you don't need to assume a constant strategy) terms of the total utility function completely independent. So R and X just drop out and you have only to find an optimal k given D. Then, the fact that the optimum investment strategy doesn't change based on X tells you you should do the same thing every day, and the rest is as you have shown.
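
The separation described in this comment can be illustrated with a short Python sketch, under the assumption of a finite horizon and a day-by-day strategy vector. Since ln(k_n X R^n (1-k_0)...(1-k_{n-1})) splits into ln X + n ln R plus terms involving only the k's, changing X and R should shift the total utility by the same constant whatever the strategy is. The names `total_log_utility` and `utility_gap`, the 50-day horizon, and the random strategies are hypothetical choices for illustration.

```python
import math
import random

def total_log_utility(fractions, X, R, D):
    """Discounted log utility when fractions[n] of current wealth
    is consumed on day n (finite horizon = len(fractions))."""
    total, wealth = 0.0, X
    for n, k in enumerate(fractions):
        total += D**n * math.log(k * wealth)
        wealth = R * (1 - k) * wealth
    return total

def utility_gap(fractions, X, R, D):
    """How much (X, R) adds relative to the baseline X = 1, R = 1."""
    return (total_log_utility(fractions, X, R, D)
            - total_log_utility(fractions, 1.0, 1.0, D))

random.seed(0)
D, days = 0.9, 50
for _ in range(3):
    strategy = [random.uniform(0.05, 0.5) for _ in range(days)]
    # The gap should come out the same for every strategy.
    print(round(utility_gap(strategy, 100.0, 1.05, D), 9))
```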