The original canonical paper seems to be "Magic Sets and Other Strange Ways to Implement Logic Programs", from 1986(!), but I find it hard to really suss out what was going on. Clearly it was about optimizing a program for a given query, and the example made a sort of sense, but the algorithm was specified in terms of some weird graph structure that I didn't have the patience to understand. There are red flags that go off in my brain also when someone seems to be doing too much explicit case analysis on bound vs. free variables.
A much more friendly paper was "The Declarative Side of Magic" which made it clearer that magic sets are about sneaking in some of the advantages of backward search while still doing forward logic programming. It's also clear that modes matter in the source logic program. I very much appreciated the example at about page 10 of the pdf where the program before compilation has two different predicates in it, listsum and sum, and you get to see how they transform into listsum' and sum', which are basically requests to compute listsum and sum.
Also, my god, why have I not been using CiteULike all my life. I mean, at least the part of my life after which it was invented. It looks really frickin' useful, like del.icio.us for papers except that in practice research papers have a much crisper ontology about them.