Anyway, I coded up one of them just now, and here's the result of running it on the text of Jane Eyre:
It's a kind of clustering algorithm for the tokens. It goes back and forth in an EM-like way between (1) estimating the probability that a token has a certain label, given that it occurs in a context with other labels next to it, and (2) finding the max-likelihood labeling of each token. Except, to keep the algorithm from always converging to something like "EVERY TOKEN IS LABEL 3 LOLOLOL", I had to abuse the probability calculation with a rather ad hoc regularization scheme. At this point I'm pretty sure it's no longer really the probability of anything. The dumb script is here, anyway, if you want to look:
At least "he" and "she" ended up in the same class, as well as "the", "a", and "an".
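For anyone who just wants the flavor of the back-and-forth described above, here is a rough sketch in Python. It is not the script mentioned above, only an illustration under assumptions: each distinct word gets one label, the context is the pair of labels on the immediate left and right, and the regularization is a made-up penalty that divides each score by the label's overall share of the text (so a label that swallows everything stops looking attractive). The function name `cluster_tokens` and the parameters `num_labels`, `iterations`, `penalty`, and `seed` are all hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def cluster_tokens(tokens, num_labels=3, iterations=5, penalty=1.0, seed=0):
    """Hard-EM-style clustering of word types by the labels of their neighbors."""
    rng = random.Random(seed)
    vocab = sorted(set(tokens))
    # Start from a random labeling of each distinct word.
    label = {w: rng.randrange(num_labels) for w in vocab}

    # Record the (left word, right word) pair for every interior occurrence.
    contexts = defaultdict(list)
    for i in range(1, len(tokens) - 1):
        contexts[tokens[i]].append((tokens[i - 1], tokens[i + 1]))

    for _ in range(iterations):
        # Step 1: count how often each label appears in each label context.
        counts = Counter()
        context_totals = Counter()
        for i in range(1, len(tokens) - 1):
            ctx = (label[tokens[i - 1]], label[tokens[i + 1]])
            counts[ctx, label[tokens[i]]] += 1
            context_totals[ctx] += 1
        label_sizes = Counter(label[w] for w in tokens)

        # Step 2: relabel each word with its best-scoring label.
        new_label = {}
        for w in vocab:
            scores = []
            for k in range(num_labels):
                # Smoothed overall share of label k in the text.
                share = (label_sizes[k] + 1) / (len(tokens) + num_labels)
                score = 0.0
                for left, right in contexts[w]:
                    ctx = (label[left], label[right])
                    # Add-one smoothed estimate of P(label k | context).
                    p = (counts[ctx, k] + 1) / (context_totals[ctx] + num_labels)
                    # Assumed ad hoc penalty: discount labels that are already big.
                    score += math.log(p) - penalty * math.log(share)
                scores.append(score)
            new_label[w] = max(range(num_labels), key=scores.__getitem__)
        label = new_label
    return label


if __name__ == "__main__":
    text = "the cat sat on the mat and a dog sat on a rug".split()
    for word, k in sorted(cluster_tokens(text).items()):
        print(k, word)
```

With `penalty=0` the score is just the smoothed log-likelihood, which is the version that tends to collapse onto a single label; with the penalty the score looks more like a pointwise-mutual-information comparison than a probability, which matches the "no longer really the probability of anything" caveat.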