December 7th, 2012

beartato phd

(no subject)

Open-ended machine learning/regression question:

Imagine the kind of situation where you have a pile of training data that consists of pairs


whose intended meaning is what you'd expect for a system that computes chess scores and the like: Alice beat Bob at some competitive task, say, chess, then Bob beat Carol, then Carol beat Bob, then Bob beat Carol again. So I'm aware of there being lots of work on designing scoring systems like this, so that score reflects how likely Alice is to beat Carol in the future, even though Alice has never played Carol before.

But suppose you have lots of other features that you can extract from the entities being compared, (e.g., I have age and hair color and average amount of chess books read per month, for Alice and Bob and Carol and Doug and Emily) and you want to predict the outcome of a match between Doug and Emily --- who are not featured in the training data at all.

Is there a name for this kind of task? Is there a good principled answer/algorithm/approach/whatever? It's weird because you have training data, but it's only indirectly about the function you seem to want to compute, which is some kind of function from person to real numbered score.