Evaluating Probabilistic Predictions — David Pennock Edition

David Pennock:

[...] So what is the “right” way to evaluate probabilistic predictions? There is no single absolute best way, though several tests are appropriate, and probably can be considered stronger tests than the calibration test. In our paper “Does Money Matter?” we use four evaluation metrics:

1. Absolute error: The average over many events of lose_PR, the probability assigned to the losing outcome(s)
2. Mean squared error: The square root of the average of (lose_PR)^2
3. Quadratic score: The average of 100 – 400*(lose_PR)^2
4. Logarithmic score: The average of log(win_PR), where win_PR is the probability assigned to the winning outcome
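
As a rough illustration, the four metrics above can be computed in a few lines of Python. This is only a sketch under stated assumptions, not the code used in the paper: the function name `score_predictions` and the sample probabilities are hypothetical, lose_PR is taken to be 1 - win_PR (the total probability placed on all losing outcomes, assuming each forecast sums to 1), and the logarithm is assumed to be the natural log, since the list does not specify a base. Note that metric 2, as defined, is a root-mean-square quantity.

```python
import math

def score_predictions(win_probs):
    """Compute the four metrics from the probabilities assigned to the
    outcomes that actually occurred (one win_PR per event).

    Assumption: lose_PR = 1 - win_PR for every event.
    """
    n = len(win_probs)
    lose_probs = [1.0 - p for p in win_probs]
    return {
        # 1. average probability assigned to the losing outcome(s)
        "absolute_error": sum(lose_probs) / n,
        # 2. square root of the average squared lose_PR
        "mean_squared_error": math.sqrt(sum(q * q for q in lose_probs) / n),
        # 3. average of 100 - 400 * lose_PR^2
        "quadratic_score": sum(100 - 400 * q * q for q in lose_probs) / n,
        # 4. average natural log of win_PR (always <= 0; closer to 0 is better)
        "log_score": sum(math.log(p) for p in win_probs) / n,
    }

# Hypothetical forecasts: probability given to the eventual winner in three events.
scores = score_predictions([0.8, 0.6, 0.5])
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Lower is better for the first two metrics and higher is better for the last two. A forecast of 0.5 on a two-outcome event earns a quadratic score of exactly 0, which is why that metric is scaled the way it is.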

Note that the absolute value of these metrics is not very meaningful. The metrics are useful only when comparing one predictor against another (e.g., a market against an expert).
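
To make the comparison concrete, here is a minimal sketch that scores two predictors on the same set of events and looks only at the difference between them. The numbers are invented for illustration and the names `mean_log_score`, `market`, and `expert` are hypothetical; the natural log is again an assumption.

```python
import math

def mean_log_score(win_probs):
    """Average natural log of the probability assigned to the winning outcome."""
    return sum(math.log(p) for p in win_probs) / len(win_probs)

# Hypothetical data: probability each predictor gave to the eventual winner
# in the same four events (same order in both lists).
market = [0.9, 0.7, 0.6, 0.8]
expert = [0.8, 0.5, 0.7, 0.6]

market_score = mean_log_score(market)
expert_score = mean_log_score(expert)

# Neither score means much in isolation; the sign and size of the gap
# are what tell the two predictors apart.
difference = market_score - expert_score
print(f"market: {market_score:.4f}  expert: {expert_score:.4f}  gap: {difference:.4f}")
```

In this made-up example the gap is positive, so the market scored better than the expert on these four events. With so few events a gap like this could easily be noise, so a real comparison would need many more events.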

My personal favorite (advocated in papers and presentations) is the logarithmic score. [...]

About Chris F. Masse

Founder and President of Midas Oracle