Category: “Theory-testing in Psychology and Physics: A Methodological Paradox”

Philosophy of Science (1967), Vol. 34, 103-15.

Inductive psychology vs deductive physics

Contrast this bizarre state of affairs with the state of affairs in physics. While there are of course a few exceptions, the usual situation in the experimental testing of a physical theory at least involves the prediction of a form of function (with parameters to be fitted); or, more commonly, the prediction of a quantitative magnitude (point-value). Improvements in the accuracy of determining this experimental function-form or point-value, whether by better instrumentation for control and making observations, or by the gathering of a larger number of measurements, has the effect of narrowing the band of tolerance about the theoretically predicted value. What does this mean in terms of the significance-testing model? It means: In physics, that which corresponds, in the logical structure of statistical inference, to the old-fashioned point-null hypothesis H0 is the value which flows as a consequence of the substantive theory T; so that an increase in what the statistician would call “power” or “precision” has the methodological effect of stiffening the experimental test, of setting up a more difficult observational hurdle for the theory T to surmount. Hence, in physics the effect of improving precision or power is that of decreasing the prior probability of a successful experimental outcome if the theory lacks verisimilitude, that is, precisely the reverse of the situation obtaining in the social sciences.

As techniques of control and measurement improve or the number of observations increases, the methodological effect in physics is that a successful passing of the hurdle will mean a greater increment in corroboration of the substantive theory; whereas in psychology, comparable improvements at the experimental level result in an empirical test which can provide only a progressively weaker corroboration of the substantive theory.
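
The physics half of this contrast can be illustrated with a small calculation. The sketch below is not from Meehl's paper; it assumes a normally distributed sample mean with known standard deviation `sigma`, a theory whose predicted point-value misses the true value by a fixed `offset`, and a "pass" criterion of the observed mean falling within `k` standard errors of the prediction. The function name and all numbers are hypothetical.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def pass_probability_point(offset: float, sigma: float, n: int, k: float = 2.0) -> float:
    """Chance that a wrong point prediction still 'passes' the test.

    Assumption: the observed mean is N(true_value, sigma^2 / n) and the
    theory predicts true_value + offset. The theory passes when the observed
    mean lies within k standard errors of the predicted value.
    """
    se = sigma / math.sqrt(n)   # the "band of tolerance" shrinks as n grows
    shift = offset / se         # prediction error measured in standard errors
    return normal_cdf(shift + k) - normal_cdf(shift - k)


# Hypothetical numbers: the prediction is off by half a standard deviation.
for n in (4, 25, 100, 400):
    print(n, pass_probability_point(offset=0.5, sigma=1.0, n=n))
```

Under these assumptions the pass probability falls toward zero as `n` grows, which is the sense in which greater precision stiffens the hurdle for a point-predicting theory.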

In physics, the substantive theory predicts a point-value, and when physicists employ “significance tests,” their mode of employment is to compare the theoretically predicted value x0 with the observed mean x̄0, asking whether they differ (in either direction!) by more than the “probable error” of determination of the latter. Hence H : x0 = μx functions as a point-null hypothesis, and the prior (logical, antecedent) probability of its being correct in the absence of theory approximates zero. As the experimental error associated with our determination of x̄0 shrinks, values of x̄0 consistent with x0 (and hence, compatible with its implicans T) must lie within a narrow range. In the limit (zero probable error, corresponding to “perfect power” in the significance test) any non-zero difference (x̄0 – x0) provides a modus tollens refutation of T. If the theory has negligible verisimilitude, the logical probability of its surviving such a test is negligible. Whereas in psychology, the result of perfect power (i.e., certain detection of any non-zero difference in the predicted direction) is to yield a prior probability p = ½ of getting experimental results compatible with T, because perfect power would mean guaranteed detection of whatever difference exists; and a difference [quasi] always exists, being in the “theoretically expected direction” half the time if our substantive theories were all of negligible verisimilitude (two-urn model). [112-3]
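
The p = ½ limit for directional predictions can be sketched numerically. This is an illustration under stated assumptions, not Meehl's own calculation: a one-sided two-sample z-test with unit variance in each group, a true difference of fixed size `effect` that is never exactly zero, and a theory of negligible verisimilitude whose predicted direction matches the true direction half the time. The function name and parameter values are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def success_probability_directional(effect: float, n_per_group: int,
                                    alpha: float = 0.05) -> float:
    """Prior probability that a worthless directional theory is 'confirmed'.

    Assumption (two-urn model): the true difference has magnitude `effect`
    and agrees with the predicted direction with probability 1/2.
    """
    se = (2.0 / n_per_group) ** 0.5          # standard error of the mean difference
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha)  # one-sided critical value
    shift = abs(effect) / se
    # True difference lies in the predicted direction: ordinary power.
    p_right = 1.0 - STD_NORMAL.cdf(z_crit - shift)
    # True difference lies in the opposite direction: chance of a spurious "success".
    p_wrong = 1.0 - STD_NORMAL.cdf(z_crit + shift)
    return 0.5 * p_right + 0.5 * p_wrong


# Hypothetical small effect; watch the probability climb toward 1/2.
for n in (10, 100, 1000, 10000):
    print(n, success_probability_directional(effect=0.2, n_per_group=n))
```

With these assumptions the probability of a "successful" outcome rises with sample size and levels off at one half, the reverse of what happens when the theory must hit a predicted point-value.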

Methodological confirmation bias

Inadequate appreciation of the extreme weakness of the test to which a substantive theory T is subjected by merely predicting a directional statistical difference d > 0 is then compounded by a truly remarkable failure to recognize the logical asymmetry between, on the one hand, (formally invalid) “confirmation” of a theory via affirming the consequent in an argument of form: [T ⊃ H1, H1, infer T], and on the other hand the deductively tight refutation of the theory modus tollens by a falsified prediction, the logical form being: [T ⊃ H1, ~H1, infer ~T].
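
The asymmetry between the two argument forms can be checked mechanically with a truth table. The sketch below is a generic validity check written for this illustration (the helper names `implies` and `is_valid` are not from the source); it treats T ⊃ H1 as material implication over two Boolean variables.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Material implication: p ⊃ q."""
    return (not p) or q


def is_valid(premises, conclusion) -> bool:
    """An argument form is valid if no truth assignment makes every
    premise true while the conclusion is false."""
    for t, h in product([True, False], repeat=2):
        if all(premise(t, h) for premise in premises) and not conclusion(t, h):
            return False  # counterexample found
    return True


# [T ⊃ H1, H1, infer T]: affirming the consequent
affirming_consequent = is_valid(
    premises=[lambda t, h: implies(t, h), lambda t, h: h],
    conclusion=lambda t, h: t,
)

# [T ⊃ H1, ~H1, infer ~T]: modus tollens
modus_tollens = is_valid(
    premises=[lambda t, h: implies(t, h), lambda t, h: not h],
    conclusion=lambda t, h: not t,
)

print(affirming_consequent)  # False
print(modus_tollens)         # True
```

The counterexample to the first form is the assignment in which T is false and H1 is true: both premises hold, yet the conclusion fails.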

While my own philosophical predilections are somewhat Popperian, I daresay any reader will agree that no full-fledged Popperian philosophy of science is presupposed in what I have just said. The destruction of a theory modus tollens is, after all, a matter of deductive logic; whereas the “confirmation” of a theory by its making successful predictions involves a much weaker kind of inference. This much would be conceded by even the most anti-Popperian “inductivist.” The writing of behavior scientists often reads as though they assumed—what it is hard to believe anyone would explicitly assert if challenged—that successful and unsuccessful predictions are practically on all fours in arguing for and against a substantive theory. [112]