## November 11th, 2012

### (no subject)

I find this story/explanation interesting, because it casts frequentist methods and Bayesian methods as somewhat orthogonal to one another, or as pursuing conflicting, incommensurable optimization goals:
http://stats.stackexchange.com/questions/2272/whats-the-difference-between-a-confidence-interval-and-a-credible-interval/2287#2287

Both the (allegedly) frequentist character and the (allegedly) Bayesian character in this story want to do the same kind of thing: they want to choose a theory, that is, a subset of the full cross product of observations (chips in the cookie) and hypotheses (the jar the cookie came from), such that simultaneously
1) the size of the subset is small - this is a measure of the informativeness of the theory, the strength and specificity of the things it does claim
2) something else is sufficiently big - a measure of how accurate the theory is
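but they differ on the choice of what should be big in (2). For the frequentist it's the column sums: for each jar, how much of that jar's probability lands on observations the theory pairs with it. For the Bayesian it's the (suitably normalized) row sums: for each observation, what share of that row falls on the jars the theory names.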
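One way to write the two criteria down, as a sketch rather than anything taken from the linked answer. The notation is assumed for illustration: $p(c \mid j)$ is the table entry for chip count $c$ (row) and jar $j$ (column), each column sums to 1, $T$ is the chosen subset of (chip count, jar) pairs, $\gamma$ is the required level (e.g. 0.7), and the Bayesian's prior over jars is taken to be uniform, which is what makes plain row normalization the right thing.

```latex
\begin{align*}
\text{(1) informativeness:} \quad & |T| \text{ is small} \\
\text{(2) frequentist:} \quad & \sum_{c \,:\, (c,j) \in T} p(c \mid j) \;\ge\; \gamma
    \quad \text{for every jar } j \\
\text{(2) Bayesian:} \quad & \frac{\sum_{j \,:\, (c,j) \in T} p(c \mid j)}{\sum_{j'} p(c \mid j')} \;\ge\; \gamma
    \quad \text{for every chip count } c
\end{align*}
```

The frequentist condition constrains each column separately, so nothing in it prevents a row from ending up with no pairs at all. The Bayesian condition constrains each row, so for $\gamma > 0$ an empty row can never satisfy it.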

Some things I like about this story: it's entirely finite and discrete, which appeals to the computer scientist in me, yet it's easy to see how it generalizes to the continuous case. It also covers an interesting edge case, the 0-chip row, where the frequentist claims the empty disjunction. To the Bayesian, this is abhorrent: it has zero credibility! There is zero probability that the cookie came from no jar! And yet, to the frequentist, asserting ⊥ upon observing 0 chips is a virtuously strong claim (a great one by measure (1) above: no disjuncts are specified at all!), and it's part of a larger, systematic theory that makes correct predictions 70% of the time.
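The empty-row edge case is easy to reproduce in a few lines. The sketch below uses a made-up table (it is not the table from the linked answer), and the greedy "keep the likeliest chip counts per jar until 70% is covered" rule is just one simple way to build a frequentist-style theory, not necessarily how that answer built its intervals. The names `TABLE`, `frequentist_theory`, `column_coverage` and `row_credibility` are hypothetical, and the credibility calculation assumes a uniform prior over jars.

```python
# Hypothetical table, NOT the one from the linked answer: P(chips | jar) in percent.
# Rows are chip counts 0..4, columns are jars A..D; each column sums to 100.
JARS = ["A", "B", "C", "D"]
TABLE = [
    [5, 10, 10, 15],   # 0 chips
    [5, 10, 75, 80],   # 1 chip
    [80, 5, 5, 5],     # 2 chips
    [10, 35, 5, 0],    # 3 chips
    [0, 40, 5, 0],     # 4 chips
]
LEVEL = 70  # required coverage, in percent


def frequentist_theory(table, level):
    """For each jar (column), greedily keep the likeliest chip counts
    until the kept entries sum to at least `level`."""
    n_rows, n_cols = len(table), len(table[0])
    chosen = set()  # set of (chip_count, jar_index) pairs
    for j in range(n_cols):
        rows_by_prob = sorted(range(n_rows), key=lambda c: table[c][j], reverse=True)
        covered = 0
        for c in rows_by_prob:
            if covered >= level:
                break
            chosen.add((c, j))
            covered += table[c][j]
    return chosen


def column_coverage(table, theory, j):
    """Frequentist measure: mass of jar j's column that lies inside the theory."""
    return sum(table[c][j] for c in range(len(table)) if (c, j) in theory)


def row_credibility(table, theory, c):
    """Bayesian measure: normalized mass of row c on the selected jars,
    assuming a uniform prior over jars."""
    selected = sum(p for j, p in enumerate(table[c]) if (c, j) in theory)
    return selected / sum(table[c])


theory = frequentist_theory(TABLE, LEVEL)
for c in range(len(TABLE)):
    jars = [JARS[j] for j in range(len(JARS)) if (c, j) in theory]
    cred = round(row_credibility(TABLE, theory, c), 2)
    print(f"{c} chips: jars {jars}, credibility {cred}")
```

With this table every column is covered to at least 70%, yet the 0-chip row comes out with no jars at all and a credibility of exactly zero, which is the situation the Bayesian objects to.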

Would again love to hear knowledgeable people chime in as to whether they think this is a fair characterization or not.