no free lunch in statistics

Simon and Tibshirani recently posted a short comment on the Reshef et al. MIC data mining paper I blogged about a while back:

The proposal of Reshef et. al. (“MIC”) is an interesting new approach for discovering non-linear dependencies among pairs of measurements in exploratory data mining. However, it has a potentially serious drawback. The authors laud the fact that MIC has no preference for some alternatives over others, but as the authors know, there is no free lunch in Statistics: tests which strive to have high power against all alternatives can have low power in many important situations.

They then report some simulation results clearly demonstrating that MIC is (very) underpowered relative to Pearson correlation in most situations, and performs even worse relative to Székely & Rizzo’s distance correlation (which I hadn’t heard about, but will have to look into now). I mentioned low power as a potential concern in my own post, but figured it would be an issue under relatively specific circumstances (i.e., only for certain kinds of associations in relatively small samples). Simon & Tibshirani’s simulations pretty clearly demonstrate that isn’t so. Which, needless to say, rather dampens the enthusiasm for the MIC statistic.
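For anyone who wants to poke at this kind of power comparison, here is a minimal sketch in Python. It is not Simon & Tibshirani's simulation design: it assumes a simple permutation test at each sample, covers only Pearson and distance correlation (MIC is left out because it needs an external implementation), and all function names (`distance_correlation`, `abs_pearson`, `permutation_pvalue`, `estimate_power`) plus the sample sizes and noise levels are my own illustrative choices.

```python
import numpy as np


def distance_correlation(x, y):
    """Sample distance correlation for two 1-D arrays of equal length."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Pairwise absolute distances within each variable.
    a = np.abs(x[:, None] - x[None, :])
    b = np.abs(y[:, None] - y[None, :])
    # Double-center: subtract row and column means, add back the grand mean.
    A = a - a.mean(axis=0) - a.mean(axis=1)[:, None] + a.mean()
    B = b - b.mean(axis=0) - b.mean(axis=1)[:, None] + b.mean()
    dcov2 = (A * B).mean()
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    return 0.0 if denom == 0 else float(np.sqrt(max(dcov2, 0.0) / denom))


def abs_pearson(x, y):
    """Absolute Pearson correlation, so larger always means more dependence."""
    return float(abs(np.corrcoef(x, y)[0, 1]))


def permutation_pvalue(stat, x, y, n_perm, rng):
    """P-value from shuffling y, which breaks any dependence on x."""
    observed = stat(x, y)
    exceed = sum(stat(x, rng.permutation(y)) >= observed for _ in range(n_perm))
    return (1 + exceed) / (1 + n_perm)


def estimate_power(stat, make_y, n=50, n_sims=100, n_perm=99, alpha=0.05, seed=0):
    """Fraction of simulated datasets in which the test rejects independence."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sims):
        x = rng.uniform(-1, 1, n)
        y = make_y(x, rng)
        if permutation_pvalue(stat, x, y, n_perm, rng) <= alpha:
            rejections += 1
    return rejections / n_sims


if __name__ == "__main__":
    # Hypothetical scenarios: a noisy linear and a noisy quadratic relationship.
    scenarios = {
        "linear": lambda x, rng: x + rng.standard_normal(x.size),
        "quadratic": lambda x, rng: x**2 + 0.3 * rng.standard_normal(x.size),
    }
    for name, make_y in scenarios.items():
        for label, stat in [("pearson", abs_pearson), ("dcor", distance_correlation)]:
            print(name, label, estimate_power(stat, make_y))
```

The estimated power is just the rejection rate over repeated simulated datasets, so the numbers will move with the noise level, sample size, and seed. The point of the exercise is to see how the ranking of the statistics changes with the shape of the association.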
