Tuesday, August 31, 2010

Units of evidence

We often encounter the question whether a proposition, P, is true or false. The probability that it is true is "p". Various arguments - logical inference - may exist to determine our subjective value of "p". In particular, Bayesian inference multiplies the probabilities "p" and "1-p" by the probabilities that the respective hypotheses assign to the newest observation, and then renormalizes them so that they sum to one again.

Some of the arguments may take the form of K-sigma deviations of the measurements from the prediction of a null hypothesis. The value of "K" may be translated to "p" through the conventional error function: for example, a 3-sigma deviation translates to a probability of about 0.3% that P is true (99.7% that it is false).
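As a rough illustration of the translation from "K" to "p", here is a minimal Python sketch. It assumes the two-sided Gaussian convention (deviations in either direction count), which is what reproduces the 0.3% figure for 3 sigma; the function name is mine, chosen just for this example.

```python
import math


def two_sided_tail_probability(k: float) -> float:
    """Probability that a Gaussian variable deviates from its mean
    by at least k standard deviations, in either direction."""
    # erfc is the complementary error function from the standard library.
    return math.erfc(k / math.sqrt(2.0))


for k in (1, 2, 3, 5):
    print(f"{k}-sigma -> p = {two_sided_tail_probability(k):.3g}")
```

For K=3 this gives roughly 0.0027, i.e. the 0.3% quoted above.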

It could be helpful to define another function of "p" or "K", called "AE", that in some sense interpolates between "p" and "K". The letters "AE" stand for "amount of evidence". It is a dimensionless quantity but you may still use the term "unit of evidence" or "UE" for the unit. "AE" is defined as
AE = ln(p/(1-p))
AE = ln(1/(1-p)-1)

p = 1/(1+exp(-AE))
For your convenience, I have also written down the formula for "AE=AE(p)" where "p" only appears once, as well as the inverse relationship where "AE" appears once. If "AE" is positive, the evidence supporting the proposition P is stronger than the evidence going in the opposite direction.

I constructed "AE" as a simple function of the odds - the ratio of probabilities of "P" and "non P", i.e. as a simple function of "p/(1-p)". The precise definition of "AE" has the obvious property that if you negate P i.e. if you exchange "p" and "1-p", the value of "AE" simply switches the sign.
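The definition, its inverse, and the sign flip under negation can be checked with a few lines of Python. This is only a sketch; the function names are hypothetical and not part of any library.

```python
import math


def amount_of_evidence(p: float) -> float:
    """AE = ln(p / (1 - p)), the natural logarithm of the odds."""
    return math.log(p / (1.0 - p))


def probability_from_evidence(ae: float) -> float:
    """Inverse relationship: p = 1 / (1 + exp(-AE))."""
    return 1.0 / (1.0 + math.exp(-ae))


p = 0.9
ae = amount_of_evidence(p)
print("AE(p)       =", ae)
print("AE(1 - p)   =", amount_of_evidence(1.0 - p))  # same size, opposite sign
print("p recovered =", probability_from_evidence(ae))
```

Note that "p=1/2" gives "AE=0", i.e. no evidence in either direction.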

Moreover, when you evaluate independent pieces of evidence by Bayesian inference, "AE" simply behaves additively: each observation shifts "AE" by the logarithm of the ratio of the probabilities that the two hypotheses assign to it. Note that if the probability of the less likely answer is very small, 10^{-n}, then the amount of evidence "AE" is approximately "2.3 n" (with the correct sign - plus means probably yes, minus means probably no) where "2.3" is approximately the natural logarithm of ten.
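The additivity can be verified numerically. The sketch below uses made-up likelihoods for two independent observations (they are illustrative assumptions, not data) and compares the "AE" of the posterior obtained by ordinary Bayesian updating with the prior "AE" plus the sum of the log-likelihood ratios.

```python
import math


def amount_of_evidence(p: float) -> float:
    """AE = ln(p / (1 - p))."""
    return math.log(p / (1.0 - p))


def bayes_update(p: float, like_if_true: float, like_if_false: float) -> float:
    """Multiply p and 1-p by the respective likelihoods and renormalize."""
    numerator = p * like_if_true
    return numerator / (numerator + (1.0 - p) * like_if_false)


prior = 0.5
# Hypothetical pairs: (probability of the observation if P is true,
#                      probability of the observation if P is false).
observations = [(0.8, 0.3), (0.6, 0.1)]

posterior = prior
total_shift = 0.0
for like_true, like_false in observations:
    posterior = bayes_update(posterior, like_true, like_false)
    total_shift += math.log(like_true / like_false)

ae_direct = amount_of_evidence(posterior)
ae_additive = amount_of_evidence(prior) + total_shift
print(ae_direct, ae_additive)  # the two numbers agree up to rounding

# The "2.3 n" rule: the less likely answer has probability 10^{-6}.
print(amount_of_evidence(1.0 - 1e-6), 6 * math.log(10.0))
```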




If the evidence against P takes the form of a K-sigma deviation of the observations from the null hypothesis, "AE" additively shifts by "K^2/2" in the obvious direction. Moreover, it should be understood that the amount of evidence is always determined with some inevitable error margin, something like ±1 or ±2. So an amount of evidence whose absolute value is smaller than 1 or 2 should be discarded as "no evidence".

If you want the equivalent of a "one in two million" confidence level, i.e. a discovery in the particle physics sense, it's approximately "AE=+14" or "AE=-14", because "ln(1 million)" is about 13.8 and "ln(2 million)" is about 14.5. Because "5^2/2=12.5", you need slightly more than 5 sigma (about 5.3) to achieve "AE=14" with the "K^2/2" rule. But it's OK to give some automatic punishment - for cherry-picking - if your evidence is based on a K-sigma deviation.

The exact formula would contribute slightly more than "K^2/2" to "AE" (by an additive term that is of order one and grows only logarithmically with "K") but because you may have cherry-picked for the "bumps", it's OK to say "K^2/2". Depending on the context, the punishment of "AE" for the possible cherry-picking of your argument could be calculated more accurately.
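To see how much the "K^2/2" rule gives away, one may compare it with the value obtained from the exact Gaussian tail. The sketch below assumes the two-sided tail probability is read directly as "p", as in the 3-sigma example earlier; the function names are hypothetical.

```python
import math


def rule_of_thumb_evidence(k: float) -> float:
    """The simple estimate |AE| = K^2 / 2."""
    return k * k / 2.0


def exact_tail_evidence(k: float) -> float:
    """|AE| = ln((1 - p) / p), with p the two-sided Gaussian tail probability."""
    p = math.erfc(k / math.sqrt(2.0))
    return math.log((1.0 - p) / p)


for k in (2, 3, 4, 5, 6):
    approx = rule_of_thumb_evidence(k)
    exact = exact_tail_evidence(k)
    print(f"K={k}: K^2/2 = {approx:5.2f}, exact = {exact:5.2f}, gap = {exact - approx:4.2f}")
```

The gap printed in the last column is the built-in punishment that the simple rule applies for possible cherry-picking.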

You may write all the rules-of-thumb that you need to use "AE" efficiently in many sorts of situations. It may be pretty useful if you could just say that "some argument provides us just with 1 or 2 units of evidence" while "another argument is potentially 6 or 7 units of evidence". The value of "AE" could be written with some error margins, too.

See also exponential percentages for a closely related proposed logarithmic terminology.