Theory of nearly everything and the "keep on looking" bias
In his newest blog entry, Sean Carroll is surprised that so few people are publicly amazed by the fact that we have already found a theory of nearly everything (TONE) - by which we mean something like a pragmatic combination of the Standard Model and General Relativity, together with the derivations of useful and conceptually simpler approximations of these theories.
I completely agree with him.
It's amazing that after just a few centuries of the serious and semi-systematic search for the laws of Nature, mankind knows the fundamental rules that govern all the processes - and the structure of all objects - we see in the world around us. That includes mechanics, electromagnetism, chemistry, biology, radioactivity, and many other categories of phenomena. If you want to see an effect whose outcome (the right probabilities) science has no way to calculate, you really have to build 27-km tunnels with the best state-of-the-art superconducting magnets, or look for the rarest radioactive processes in the deepest mines, or look at some extremely distant places in the Universe with the best telescopes.
In other words, you have to look for objects and effects that are "almost" as exotic, distant, rare, and impractical as the effects of string theory. It's no coincidence that it's hard to test the cutting-edge theories in physics. The reason is simple: whatever surrounds us has been understood, at least in principle. Needless to say, this is both great and disappointing news, depending on your intentions.
And what about my answer?
I think that the reason why this fantastic fact is almost never "loudly" appreciated is that the huge hierarchy of knowledge in physics de facto divides almost all the people into two rather sharply defined groups. One group has learned modern physics and finds the statement above trivial; the other group hasn't learned modern physics and isn't aware of the fact - or doesn't "believe" it, as they would self-confidently say.
All of us encounter laymen all the time. But it's not just the classical laymen. It's many people who have been obsessed with physics throughout their lives - and they may be 70 years old. Most amateur physicists - and I know very many, indeed - still can't comprehend that physics has answered all the questions that they usually ask. They see dragons around every other corner.
They think that the sign of antimatter's mass is a complete mystery. They think that the orbits of XY are not understood. They think that if you replace some piece of material in a gravitational field by something else, it's completely unclear what the light will do. And so on, and so on. Although the feeling of mystery may be interesting, romantic, and thrilling, the current science is simply elsewhere. It can answer all such questions.
Obviously, I am talking only about a certain kind of "fundamental physics" questions - questions that may be difficult because of extreme conditions but not because of a stunning complexity of the arrangement of the building blocks. The latter may remain, and sometimes do remain, difficult today. But it's true that even many of those "seemingly difficult" questions may be solved by tools that most people believe to be non-existent.
Tommaso Dorigo has re-discovered the "keep on looking" bias, also known as "sampling to a foregone conclusion" or "the problem of repeated significance tests". If someone wants to find a signal that will serve as evidence supporting a conjecture, usually because he believes that the conjecture is true, he will keep on looking, and sooner or later, he inevitably has to find the evidence.
That's the actual reason why most of the published 3-sigma claims in the literature are wrong.
There's an easy way to compensate for this "keep on looking" bias - assuming that the scientists are honest enough to admit what they have been doing.
If a scientist claims that his signal could only appear as a statistical fluke with probability "p", which is a number much smaller than one (e.g. 0.3% for 3-sigma signals), he should actually multiply this "p" by "N", the effective number of inequivalent past attempts to find a signal that would have led to (nearly) the same conclusion but in which no signal appeared.
It's not hard to see where the formula "Np" comes from. If a (fake) positive signal occurs in one box with a small probability "p", the probability that it appears at least in one of "N" boxes is approximately equal to "Np" (assuming that even the latter is sufficiently smaller than one).
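Since the whole argument is just elementary probability, a minimal sketch - a few lines of Python with an arbitrary illustrative value of "p" - shows how well the linear "Np" estimate tracks the exact probability 1-(1-p)^N:

```python
# A small numerical check of the "Np" rule: if a fake positive appears in one
# box with probability p, the chance of seeing it in at least one of N
# independent boxes is 1 - (1 - p)**N, which stays close to N*p as long as
# N*p is much smaller than one.
p = 0.003                      # a 3-sigma-like fake-positive probability
for N in (1, 10, 50):
    exact = 1 - (1 - p) ** N   # exact chance of at least one fluke among N boxes
    print(f"N = {N:2d}:  exact = {exact:.4f},  N*p = {N * p:.4f}")
```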
What does it imply?
For example, a scientist finds a 3-sigma signal that confirms a conjecture - e.g. that the Himalayan glaciers will melt by 2035. That corresponds to a 99.7% certainty - or a small, 0.3% probability that such evidence occurred by chance (the probability of a fake positive).
However, the person has made 149 similar attempts to find a 3-sigma "proof" of dying Himalayan glaciers in the past - which he can reveal or hide. In total, the person has looked for a "proof" of the same thing at 150 independent places. These 149 previous attempts have failed and only the 150th one succeeded. It follows that the real probability that the 3-sigma signal has occurred somewhere by chance is approximately equal to 150 times 0.3% which is close to 50%.
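To make the arithmetic explicit, here is the same elementary computation applied to the glacier example - the inputs are just the 0.3% and the 150 attempts quoted above. The exact formula gives roughly 36% while the linear "Np" estimate gives the 45% (close to 50%) mentioned in the text; either number is damning enough:

```python
# The glacier example in numbers (a sketch; both inputs come from the text above).
p = 0.003    # probability of a fake positive in a single search (3 sigma)
N = 150      # total number of independent places where a "proof" was sought

exact = 1 - (1 - p) ** N     # exact chance that at least one search yields a fluke
approx = N * p               # the linear "Np" estimate used in the text

print(f"exact chance of at least one fluke: {exact:.1%}")   # about 36%
print(f"linear estimate N*p:                {approx:.1%}")  # 45%, i.e. close to 50%
```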
Clearly, because of the repetition - because of the "keep on looking bias", the signal that he has found at the very end has no confidence level that would be worth talking about. The "amount of evidence" is nearly zero.
This simple mathematical argument has many consequences. For example, most published 3-sigma results are wrong - because they result from similar "repetitive searches". If you guess a reasonable estimate for "N", you may quantify the percentage of 3-sigma papers that are wrong. Also, it follows that approximately 99% of the existing literature on climate science or healthy food is pure junk.
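If you want to play with a "reasonable estimate for N", a sketch like the following shows the trend; the values of N below are pure assumptions chosen for illustration, not numbers extracted from any actual survey of the literature:

```python
# How the chance that a published 3-sigma "discovery" is a fluke grows with the
# (usually unreported) number N of equivalent searches that preceded it.
# The N values are hypothetical, picked only to show the trend.
p = 0.003
for N in (10, 50, 150, 300, 1000):
    fluke = 1 - (1 - p) ** N
    print(f"N = {N:4d}:  chance the published signal is a fluke: {fluke:.0%}")
```

Once "N" reaches a couple of hundred, more than half of such signals are expected to be flukes.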
It would be good if the confidence levels were adjusted for the "keep on looking" bias but I am not sure whether scientists in the real world are honest enough to report their actual value of "N" for a given result they want to defend.
In other words, some scientific papers may look convincing to an external reader but the external reader is usually unable to quantify the amount of cherry-picking that was done or had to be done in order to achieve the given result. There are fixes to improve this systematic problem - but the main problem is that many authors don't really want this problem to be fixed.