Sunday, August 2, 2009

Wisdom of Statistical Crowds

Jordan Ellenberg is a mathematician I came to adore after a talk on philosophy and mathematics he gave at Caltech six years ago. He turned out to also be a delightful Slate columnist and author, and now he has a blog! (Thanks for the heads up, Ben.) His list of books read is also what inspired me, a couple of years ago, to keep track of what I read, which is why you are periodically subjected to sets of three book 'impressions' (not exactly reviews) on my own blog.

Anyway, I liked this observation of his on the Netflix contest (to improve the accuracy of their recommendation system's predictions by 10%):
One of the really interesting lessons of the competition is that blendings of many algorithms seem to work better than any single algorithm, even when there’s no principled reason to do the blend. It’s sort of a “wisdom of crowds of computer programs” effect.
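The blending effect is easy to see even with toy data. Here is a minimal sketch (nothing below comes from the actual contest; the data and models are made up): three individually biased predictors are averaged, and the blend's errors partly cancel, so it beats every single model.

```python
# Toy illustration of "blends beat any single model": average several
# deliberately biased predictors and the combined error is lower
# because their individual mistakes partly cancel out.
import random

random.seed(1)

def make_point():
    x = random.uniform(0, 10)
    # Hidden "true" rule, plus noise the models can never explain.
    return x, 3.0 * x + 5.0 + random.gauss(0, 1)

data = [make_point() for _ in range(200)]

# Three different, individually biased predictors.
models = [
    lambda x: 2.5 * x + 7.0,   # underestimates the slope
    lambda x: 3.5 * x + 3.0,   # overestimates the slope
    lambda x: 3.0 * x + 4.0,   # right slope, biased intercept
]

def rmse(predict):
    return (sum((predict(x) - y) ** 2 for x, y in data) / len(data)) ** 0.5

# The "blend": just average the three predictions.
blend = lambda x: sum(m(x) for m in models) / len(models)

for i, m in enumerate(models):
    print(f"model {i}: RMSE {rmse(m):.3f}")
print(f"blend  : RMSE {rmse(blend):.3f}")
```

The averaging here works because the models' biases point in different directions; that is one (principled, after the fact) reading of why unprincipled blends help.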
And this reminded me of my experience trying to explain CDS spreads (basically, the price of debt default insurance) of emerging market sovereigns when I worked on Wall Street. I came up with a nice model based on current CDS and economic data and found a few key factors that determine the credit risk. I then set about backtesting my results with CDS data from 1998 onward. As it turned out, using a giant list of every possible factor to predict future spreads was much more profitable than using a more robust, smaller model! (For those unfamiliar with econometrics: it is not recommended to use 18 explanatory variables to analyze a dataset of 24 countries, because the model becomes "overfit" and useless. Every spurious difference between two countries predicts a corresponding difference in the future that usually doesn't pan out.)
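The overfitting warning in that parenthesis can be demonstrated in a few lines. This is a toy regression, not the original CDS model: spreads are simulated so that only 3 of 18 candidate factors actually matter, then ordinary least squares is fit on a "sample" of 24 countries. The 18-variable fit nearly memorizes the sample but does worse on fresh data.

```python
# Toy illustration (not the original model) of why 18 explanatory
# variables on 24 observations overfits: the big regression nearly
# memorizes the sample, yet predicts fresh data worse than a small one.
import numpy as np

rng = np.random.default_rng(0)

n_train, n_test = 24, 1000        # 24 "countries", then many fresh draws
X_train = rng.normal(size=(n_train, 18))
X_test = rng.normal(size=(n_test, 18))

# True spreads depend on only the first 3 factors, plus noise.
true_beta = np.array([2.0, -1.5, 1.0] + [0.0] * 15)
y_train = X_train @ true_beta + rng.normal(scale=2.0, size=n_train)
y_test = X_test @ true_beta + rng.normal(scale=2.0, size=n_test)

def fit_and_score(k):
    """OLS on the first k factors; return (in-sample, out-of-sample) MSE."""
    beta, *_ = np.linalg.lstsq(X_train[:, :k], y_train, rcond=None)
    mse_in = np.mean((X_train[:, :k] @ beta - y_train) ** 2)
    mse_out = np.mean((X_test[:, :k] @ beta - y_test) ** 2)
    return mse_in, mse_out

for k in (3, 18):
    mse_in, mse_out = fit_and_score(k)
    print(f"{k:2d} variables: in-sample MSE {mse_in:.2f}, "
          f"out-of-sample MSE {mse_out:.2f}")
```

In a backtest against realized future prices, though, the kitchen-sink model won anyway, which is the puzzle.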

In this instance, while there is no principled reason to blend so many methods of predicting CDS spreads, the fact that it worked so much better than the 'sensible' thing made me very curious about the mechanism of price determination in the financial sector. Prices are thought of as magical reflections of the truth (and I certainly don't want to undermine that fundamental trait of free markets - it is incredibly powerful and robust). But the probability of very rare events is hard to know, hard to bet on, and hard to verify and adjust. The market price follows the beliefs of the buyers and sellers more than anything else, and those buyers and sellers were trying to make educated guesses using the same tools I was. Even if something in the data is spurious and not relevant to predicting future prices, if someone else sees it and makes a decision based on it, it's useful for me to consider it too. So it makes some sense that a needlessly complicated model would be more profitable than an economically sane one.
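That mechanism can be sketched in miniature (all numbers and names here are illustrative assumptions, not from the original research): suppose a spurious factor is unrelated to true default risk, but some traders price it in anyway. Then a forecaster who also tracks the spurious factor predicts the *market price* better than one who only tracks fundamentals.

```python
# Toy sketch of the self-fulfilling mechanism: the spurious factor has
# nothing to do with fundamentals, but because some traders act on it,
# it leaks into the price -- so the "kitchen sink" forecaster beats the
# "sane" fundamentals-only forecaster at predicting the price.
import random

random.seed(2)

def simulate(n_assets=500):
    rows = []
    for _ in range(n_assets):
        fundamental = random.gauss(0, 1)
        spurious = random.gauss(0, 1)  # unrelated to true default risk
        # Market price: fundamentals, plus the spurious factor at the
        # weight traders give it, plus idiosyncratic noise.
        price = fundamental + 0.5 * spurious + random.gauss(0, 0.2)
        rows.append((fundamental, spurious, price))
    return rows

rows = simulate()

def mse(predict):
    return sum((predict(f, s) - p) ** 2 for f, s, p in rows) / len(rows)

sane_model = lambda f, s: f               # fundamentals only
kitchen_sink = lambda f, s: f + 0.5 * s   # also tracks the spurious factor

print(f"fundamentals-only MSE: {mse(sane_model):.3f}")
print(f"kitchen-sink MSE:      {mse(kitchen_sink):.3f}")
```

The sketch begs the question of *why* traders weight the spurious factor in the first place, which is exactly the part that would need real data to study.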

I wish this were something I could study rigorously, but good luck getting proprietary research methods released from enough major players in even as small a niche market as emerging market sovereign insurance. You would have to use a vastly simplified lab setup where you throw useless information at people and see whether its perceived importance becomes self-fulfilling. (I think this has been done, actually, but I'm too lazy to look up the reference.)

And the alternative is that there really is some wisdom in statistical crowds and statistical methods. It certainly looks like there is in the Netflix contest, because I can't think of a similar self-fulfilling mechanism that might be in effect. (A recommendation system will of course determine what movies people are more likely to view, but not what they will think of them, and none of the proposed systems are actually in use yet anyway.) That's pretty darn cool too.