Jason (jcreed) wrote,

The following thought popped into my head in a strangely fully-formed sort of way, as if I had been thinking about it subconsciously for hours while asleep. Quite possibly someone (susan? :) will chime in and tell me this has all been done seventy years ago, but here goes anyway:

If you present pairs of points to people and ask them to judge which is vertically above the other, they're bound to do worse (i.e. be closer to random guessing) if the points are horizontally very far away from each other, right? And similarly if you rotate the whole setup by 90 degrees or whatever? I imagine this is a totally standard human-perception psych experiment that's been replicated a billion times.

Note then that if we choose two random points according to, say, an isotropic Gaussian distribution, the distribution of the distance between them depends in a known way on the dimension of the ambient space. Most of the volume of a high-dimensional ball is near its surface, right?
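To make the dimension dependence concrete, here is a minimal sketch that samples pairs of points from a standard isotropic Gaussian and looks at the distance between them. The function names (`pair_distances`, `relative_spread`) and the sample sizes are just illustrative choices. The thing to watch is that the typical distance grows roughly like sqrt(2 * dim), while the spread relative to the mean shrinks as the dimension goes up.

```python
import math
import random


def pair_distances(dim, n_pairs=2000, seed=0):
    """Distances between n_pairs pairs of points drawn from N(0, I_dim)."""
    rng = random.Random(seed)
    dists = []
    for _ in range(n_pairs):
        sq = 0.0
        for _ in range(dim):
            # Difference of two independent standard normals on one axis.
            diff = rng.gauss(0.0, 1.0) - rng.gauss(0.0, 1.0)
            sq += diff * diff
        dists.append(math.sqrt(sq))
    return dists


def relative_spread(dim, n_pairs=2000, seed=0):
    """Return (mean distance, standard deviation / mean) for a given dim."""
    d = pair_distances(dim, n_pairs, seed)
    mean = sum(d) / len(d)
    var = sum((x - mean) ** 2 for x in d) / len(d)
    return mean, math.sqrt(var) / mean


if __name__ == "__main__":
    for dim in (1, 10, 100):
        mean, spread = relative_spread(dim)
        print(f"dim={dim:3d}  mean distance={mean:.2f}  relative spread={spread:.3f}")
```

So in low dimensions you see pairs at all sorts of distances, and in high dimensions nearly every pair is about equally far apart.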

So, if we do a giant experiment with rating books/movies/whatever with pairwise comparisons ("Do you like book X more than book Y?") we're going to observe certain patterns of unreliability: if the "true rankings" of two books are close, then maybe the next time you ask even the same person to compare them, you get the opposite answer. Or another perhaps structurally similar thing to observe is that across different people, two books that are on average about as liked as one another could be more or less contentious. In other words, there's a difference between two books where absolutely everyone agrees that X is just a tiny bit better than Y, and two books where everyone is totally polarized but a slightly larger fraction of people violently prefer X than violently prefer Y.

The main question is: could we extract from all these statistics an estimate of the "dimension" of the space of books, movies, etc., such that the pattern of unreliability for judging the relative position of points in a Gaussian spread over that dimension looks kind of like the observed pattern of unreliability of rankings? I don't necessarily think that this judging-Gaussian-distributed-points assumption is accurate, but it might be meaningful enough to pull out a single number.
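Here is a toy sketch of what "pulling out a single number" might look like. Everything about the noise model is an assumption made up for illustration: items are points from a standard isotropic Gaussian, the first coordinate is the axis being judged, and the judge's error is Gaussian with standard deviation `noise_scale` times the distance between the two items along all the other coordinates (the analogue of "horizontally far apart" in the points experiment). The names `error_rate` and `estimate_dimension` and the value of `noise_scale` are hypothetical.

```python
import math
import random


def error_rate(dim, noise_scale=0.3, n_trials=4000, seed=0):
    """Fraction of simulated comparisons that come out the wrong way.

    Assumed model: perceptual noise scales with the separation of the
    two items along the dim - 1 coordinates that are not being judged.
    """
    rng = random.Random(seed)
    errors = 0
    for _ in range(n_trials):
        # True difference along the judged axis.
        gap = rng.gauss(0.0, 1.0) - rng.gauss(0.0, 1.0)
        # Squared separation along every other axis.
        other_sq = 0.0
        for _ in range(dim - 1):
            diff = rng.gauss(0.0, 1.0) - rng.gauss(0.0, 1.0)
            other_sq += diff * diff
        noise = rng.gauss(0.0, noise_scale * math.sqrt(other_sq))
        # Wrong answer when the perceived gap has the opposite sign.
        if (gap + noise) * gap < 0:
            errors += 1
    return errors / n_trials


def estimate_dimension(observed_rate, candidate_dims, noise_scale=0.3):
    """Pick the candidate dimension whose simulated error rate is closest."""
    return min(
        candidate_dims,
        key=lambda d: abs(error_rate(d, noise_scale) - observed_rate),
    )


if __name__ == "__main__":
    for dim in (1, 2, 5, 10, 20, 50):
        print(dim, round(error_rate(dim), 3))
```

Under these assumptions the error rate climbs toward 1/2 as the dimension grows, so an observed error rate can be matched back to a dimension. The catch is that the answer depends entirely on `noise_scale`, which real data would not hand you for free, so this is more a picture of the idea than a method.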

The thing I'm interested in, the possibly meaningful new information, if the answer to the above question is yes, is whether books, as a medium, or movies, as a medium, or people on dating sites, or restaurants, or web pages, etc. have higher or lower dimensions than one another.
Tags: math
