Contrast this bizarre state of affairs with the state of affairs in physics. While there are of course a few exceptions, the usual situation in the experimental testing of a physical theory at least involves the prediction of a form of function (with parameters to be fitted); or, more commonly, the prediction of a quantitative magnitude (point-value). Improvements in the accuracy of determining this experimental function-form or point-value, whether by better instrumentation for control and making observations, or by the gathering of a larger number of measurements, have the effect of narrowing the band of tolerance about the theoretically predicted value. What does this mean in terms of the significance-testing model? It means: In physics, that which corresponds, in the logical structure of statistical inference, to the old-fashioned point-null hypothesis H0 is the value which flows as a consequence of the substantive theory T; so that an increase in what the statistician would call “power” or “precision” has the methodological effect of stiffening the experimental test, of setting up a more difficult observational hurdle for the theory T to surmount. Hence, in physics the effect of improving precision or power is that of decreasing the prior probability of a successful experimental outcome if the theory lacks verisimilitude, that is, precisely the reverse of the situation obtaining in the social sciences.
As techniques of control and measurement improve or the number of observations increases, the methodological effect in physics is that a successful passing of the hurdle will mean a greater increment in corroboration of the substantive theory; whereas in psychology, comparable improvements at the experimental level result in an empirical test which can provide only a progressively weaker corroboration of the substantive theory.
In physics, the substantive theory predicts a point-value, and when physicists employ “significance tests,” their mode of employment is to compare the theoretically predicted value x0 with the observed mean x̄0, asking whether they differ (in either direction!) by more than the “probable error” of determination of the latter. Hence H: x0 = μx functions as a point-null hypothesis, and the prior (logical, antecedent) probability of its being correct in the absence of theory approximates zero. As the experimental error associated with our determination of x̄0 shrinks, values of x̄0 consistent with x0 (and hence, compatible with its implicans T) must lie within a narrow range. In the limit (zero probable error, corresponding to “perfect power” in the significance test) any non-zero difference (x̄0 – x0) provides a modus tollens refutation of T. If the theory has negligible verisimilitude, the logical probability of its surviving such a test is negligible. Whereas in psychology, the result of perfect power (i.e., certain detection of any non-zero difference in the predicted direction) is to yield a prior probability p = ½ of getting experimental results compatible with T, because perfect power would mean guaranteed detection of whatever difference exists; and a difference [quasi] always exists, being in the “theoretically expected direction” half the time if our substantive theories were all of negligible verisimilitude (two-urn model). [112-3]
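The asymmetry described above can be illustrated with a small simulation. What follows is only a sketch under stated assumptions, not anything taken from the quoted text: the function names, the uniform spread of true values on [-1, 1], the unit standard deviation, and the 1.96 cutoff are all hypothetical choices made for illustration. In both functions the theory is assumed to have no verisimilitude, so the true value is unrelated to what the theory predicts. The "physics" test passes when the observed mean lies within the error band around the predicted point x0; the "psychology" test passes when a significant difference appears in the predicted direction.

```python
import math
import random


def physics_pass_rate(n, trials=2000, seed=0):
    """Share of false theories whose point prediction survives the test.

    Assumption: the true value is drawn uniformly from [-1, 1] and is
    unrelated to the theory's predicted point x0 = 0.
    """
    rng = random.Random(seed)
    predicted = 0.0                  # x0, the theoretically predicted value
    se = 1.0 / math.sqrt(n)          # standard error of the mean (sigma = 1)
    passed = 0
    for _ in range(trials):
        true_value = rng.uniform(-1.0, 1.0)
        observed_mean = rng.gauss(true_value, se)
        # The theory survives only if the observed mean falls inside
        # the tolerance band around the predicted point.
        if abs(observed_mean - predicted) <= 1.96 * se:
            passed += 1
    return passed / trials


def psychology_pass_rate(n, trials=2000, seed=0):
    """Share of false theories "confirmed" by a directional difference.

    Assumption: some true difference always exists, drawn uniformly from
    [-1, 1], so its sign matches the predicted direction about half the time.
    """
    rng = random.Random(seed)
    se = 1.0 / math.sqrt(n)
    passed = 0
    for _ in range(trials):
        true_difference = rng.uniform(-1.0, 1.0)
        observed_difference = rng.gauss(true_difference, se)
        # The theory is counted as corroborated when the difference is
        # significant and lies in the predicted (positive) direction.
        if observed_difference > 1.96 * se:
            passed += 1
    return passed / trials


if __name__ == "__main__":
    for n in (10, 100, 10000):
        print(n, physics_pass_rate(n), psychology_pass_rate(n))
```

Under these assumptions, raising n should drive the physics pass rate toward zero while the psychology pass rate climbs toward one half, which is the pattern the two-urn argument predicts. The exact figures depend on the arbitrary spread chosen for the true values.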