Evaluating Probabilistic Predictions — David Pennock Edition

Chris F. Masse February 20th, 2007


David Pennock:

[...] So what is the “right” way to evaluate probabilistic predictions? There is no single absolute best way, though several tests are appropriate, and probably can be considered stronger tests than the calibration test. In our paper “Does Money Matter?” we use four evaluation metrics:

1. Absolute error: The average over many events of lose_PR, the probability assigned to the losing outcome(s)
2. Mean squared error: The square root of the average of (lose_PR)^2
3. Quadratic score: The average of 100 - 400*(lose_PR)^2
4. Logarithmic score: The average of log(win_PR), where win_PR is the probability assigned to the winning outcome

Note that the absolute value of these metrics is not very meaningful. The metrics are useful only when comparing one predictor against another (e.g., a market against an expert).
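
To make the four definitions concrete, here is a minimal Python sketch that computes them for a list of forecasts and compares two predictors. It rests on assumptions that are not in the quoted text: each event is binary, so lose_PR = 1 - win_PR; the logarithm is the natural log (the quote does not give a base); and the function name `evaluate`, the dictionary keys, and the sample probabilities are hypothetical, made up purely for illustration.

```python
import math


def evaluate(win_probs):
    """Score a predictor from the probabilities it gave to the outcomes that occurred.

    win_probs: one win_PR per event, each in (0, 1].
    Assumes binary events, so lose_PR = 1 - win_PR.
    A win_PR of 0 makes the logarithmic score undefined (math.log raises ValueError).
    """
    n = len(win_probs)
    lose_probs = [1.0 - p for p in win_probs]
    return {
        # 1. Average probability assigned to the losing outcome (lower is better).
        "absolute_error": sum(lose_probs) / n,
        # 2. Square root of the average squared lose_PR, as defined in the quote
        #    (lower is better).
        "mean_squared_error": math.sqrt(sum(q * q for q in lose_probs) / n),
        # 3. Average of 100 - 400 * lose_PR^2 (higher is better).
        "quadratic_score": sum(100 - 400 * q * q for q in lose_probs) / n,
        # 4. Average natural log of win_PR (higher is better; never above 0).
        "log_score": sum(math.log(p) for p in win_probs) / n,
    }


# Hypothetical win_PR values for the same four events from two predictors.
market = [0.8, 0.6, 0.9, 0.3]
expert = [0.7, 0.5, 0.95, 0.4]

market_scores = evaluate(market)
expert_scores = evaluate(expert)
for name in market_scores:
    print(f"{name}: market={market_scores[name]:.3f} expert={expert_scores[name]:.3f}")
```

Under these assumptions, a predictor that always says 50% gets a quadratic score of exactly 0, which is one way to read the 100 and 400 constants: a coin-flip forecast scores 0 and a perfect forecast scores 100. As the note above says, the numbers only become informative when the two columns are read side by side.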

My personal favorite (advocated in papers and presentations) is the logarithmic score. [...]
