Tag: method

How not to be misled by words

To point (4)—induction is a myth—I wish to add only that nothing depends upon words. If anybody should write, as did Peirce, “The operation of testing a hypothesis by experiment … I call induction”, I should not object, as long as he is not misled by the word. But Peirce was misled, as were many others. This is why I prefer to use the word “induction” to stand for the myth that the repetition of something—“observations” or “instances”, perhaps—provides some rational basis for the acceptance of hypotheses. Peirce, in spite of the flawless explanation he sometimes gave of the method of hypotheses and tests, at other times defended precisely this myth; for example, when he compared natural laws with habits (acquired by repetition) and when he tried to give a probabilistic theory of induction. It is induction by repetition (and therefore probabilistic induction) which I combat as the centre of the myth; and in view of the past history of induction from Aristotle and Bacon to Peirce and Carnap, it seems to me appropriate to use the term “induction” as standing, briefly, for “induction by repetition”. [1032]

Scientific methodology (German edition)

3. The deductive testing of theories. The method of critically testing theories, and of selecting among them, is in our view always the following: from the provisionally unjustified anticipation, the idea, the hypothesis, the theoretical system, conclusions are drawn by logical deduction; these are compared with one another and with other statements by determining what logical relations (e.g. equivalence, derivability, compatibility, contradiction) hold between them.

In particular, four lines along which the testing is carried out can be distinguished: the logical comparison of the conclusions among themselves, by which the system is examined for internal consistency; an investigation of the logical form of the theory, with the aim of determining whether it has the character of an empirical-scientific theory, that is, for example, whether it is not tautological; the comparison with other theories, in order to determine, among other things, whether the theory under test would count as a scientific advance should it prove itself in the various tests; and finally the testing by “empirical application” of the derived conclusions.

This last test is meant to determine whether what is new in the theory's claims also holds up in practice, for instance in scientific experiments or in practical technological application. Here too the testing procedure is deductive: from the system (with the help of statements already accepted) singular conclusions (“predictions”) that are as easily testable or applicable empirically as possible are deduced, and from these are selected in particular those that are not derivable from known systems, or that contradict them. A decision is then reached about these (and other) conclusions in connection with practical application, experiments, and so on. If the decision is positive, that is, if the singular conclusions are accepted, verified, then the system has passed the test for the time being; we have no reason to reject it. If a decision is negative, that is, if conclusions are falsified, then their falsification also strikes the system from which they were deduced.

A positive decision can only ever support the system provisionally; it can always be overturned by later negative decisions. As long as a system withstands detailed and severe deductive tests and is not superseded by the advancing development of science, we say that it is corroborated.

Elements of inductive logic do not occur in the procedure sketched here; we never infer from the validity of singular statements to that of theories. Nor can theories ever be shown to be “true”, or even merely “probable”, by their verified conclusions.

Open to suspicion

In preparing this table [a variation of Elderton’s Table of Goodness of Fit] we have borne in mind that in practice we do not want to know the exact value of P for any observed χ², but, in the first place, whether or not the observed value is open to suspicion. If P is between ·1 and ·9 there is certainly no reason to suspect the hypothesis tested. If it is below ·02 it is strongly indicated that the hypothesis fails to account for the whole of the facts. We shall not often be astray if we draw a conventional line at ·05, and consider that higher values of χ² indicate a real discrepancy. [80, 11th ed.]

In preparing this table [a variation of Elderton’s Table of Goodness of Fit] we have borne in mind that in practice we do not want to know the exact value of P for any observed χ², but, in the first place, whether or not the observed value is open to suspicion. If P is between ·1 and ·9 there is certainly no reason to suspect the hypothesis tested. If it is below ·02 it is strongly indicated that the hypothesis fails to account for the whole of the facts. Belief in the hypothesis as an accurate representation of the population sampled is confronted by the logical disjunction: Either the hypothesis is untrue, or the value χ² has attained by chance an exceptionally high value. The actual value of P obtainable from the table by interpolation indicates the strength of the evidence against the hypothesis. A value of χ² exceeding the 5 per cent. point is seldom to be disregarded. [80, 14th ed.]
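As an illustration of the two passages above, here is a minimal sketch of how P might be obtained for an observed χ² and read against the thresholds they mention (·02, ·05, and the band from ·1 to ·9). The function names `chi2_sf` and `reading`, the wording of the labels, and the restriction to whole-number degrees of freedom are assumptions made for this example; they are not taken from the quoted text, which works from a printed table.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(chi-squared with `df` degrees of freedom >= x).

    Assumes `df` is a positive integer. Uses the recurrence
    Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1), with y = x / 2,
    starting from Q(1/2, y) = erfc(sqrt(y)) or Q(1, y) = exp(-y).
    """
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2.0:
        # Add the next term of the recurrence, computed in log space.
        q += math.exp(-y + a * math.log(y) - math.lgamma(a + 1.0))
        a += 1.0
    return q


def reading(p):
    """Hypothetical labels for the ranges of P named in the quoted passages."""
    if p < 0.02:
        return "strongly indicated that the hypothesis fails"
    if p < 0.05:
        return "beyond the 5 per cent point"
    if 0.1 <= p <= 0.9:
        return "no reason to suspect the hypothesis"
    return "not classified by the quoted passage"


# Usage: an observed chi-squared of 10.0 on 4 degrees of freedom.
p = chi2_sf(10.0, 4)  # about 0.040
print(round(p, 3), reading(p))
```

The later edition's wording suggests reporting P itself as the strength of the evidence rather than only the label, so the labels here are best treated as a reading aid and not as a decision rule.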

Vague induction

It is clear that, if one uses the word “induction” widely and vaguely enough, any tentative acceptance of the result of any investigation can be called “induction”. In that sense, but (I must emphasize) in no other, Professor Putnam is quite right to detect an “inductivist quaver” in one of the passages he quotes (section 3). But in general he has not read, or if read not understood, what I have written … . [994]

Infinite learning

Thus every statement (or ‘basic statement’) remains essentially conjectural; but it is a conjecture which can be easily tested. These tests, in their turn, involve new conjectural and testable statements, and so on, ad infinitum; and should we try to establish anything with our tests, we should be involved in an infinite regress. But as I explained in my Logic of Scientific Discovery (especially section 29), we do not establish anything by this procedure: we do not wish to ‘justify’ the ‘acceptance’ of anything, we only test our theories critically, in order to see whether or not we can bring a case against them. [521]

Severely risky

A serious empirical test always consists in the attempt to find a refutation, a counterexample. In the search for a counterexample, we have to use our background knowledge; for we always try to refute first the most risky predictions, the ‘most unlikely … consequences’ (as Peirce already saw); which means that we always look in the most probable kinds of places for the most probable kinds of counterexamples—most probable in the sense that we should expect to find them in the light of our background knowledge. Now if a theory stands up to many such tests, then, owing to the incorporation of the results of our tests into our background knowledge, there may be, after a time, no places left where (in the light of our new background knowledge) counter examples can with a high probability be expected to occur. But this means that the degree of severity of our test declines. This is also the reason why an often repeated test will no longer be considered as significant or as severe: there is something like a law of diminishing returns from repeated tests (as opposed to tests which, in the light of our background knowledge, are of a new kind, and which therefore may still be felt to be significant). These are facts which are inherent in the knowledge-situation; and they have often been described—especially by John Maynard Keynes and by Ernest Nagel—as difficult to explain by an inductivist theory of science. But for us it is all very easy. And we can even explain, by a similar analysis of the knowledge-situation, why the empirical character of a very successful theory always grows stale, after a time. We may then feel (as Poincaré did with respect to Newton’s theory) that the theory is nothing but a set of implicit definitions or conventions—until we progress again and, by refuting it, incidentally re-establish its lost empirical character. (De mortuis nil nisi bene: once a theory is refuted, its empirical character is secure and shines without blemish.) [325-6]

The real Popper

It is worth noting that even in Lakatos’s own “methodology of scientific research programmes” (“MSRP”)—a type of sophisticated methodological falsificationism that Lakatos presents as the crowning synthesis of the “thesis” dogmatic falsificationism and the “antithesis” naive methodological falsificationism—the test statements and interpretative theories still are accepted on the basis of a research program. So Lakatos gives a conventionalist solution to the problem of how basic statements are selected, in his interpretation of Popper’s methodology and in his own methodology as well.

This interpretation of Popper is not correct, and the suggested conventionalist solution to the problem of how test statements are accepted is not satisfying. Popper’s criticist solution, which Lakatos has not correctly understood, is much better and is also a solution that allows us to understand the history of science better than Lakatos’s oversophisticated combination of conventionalism and falsificationism. Lakatos maintains that sophisticated methodological falsificationism combines the best elements of voluntarism, pragmatism, and the realist theories of empirical growth. Critical falsificationism is better still, among other reasons because it avoids that kind of eclecticism. And for those interested in the history of ideas, it might be worthwhile to know that the real Popper is neither a dogmatic falsificationist nor a naive or sophisticated methodological falsificationist. Not only Popper₀ but also Popper₁ and Popper₂ are myths created by a misunderstanding of Popper’s critical falsificationism. [53]

Falsification as conditional disproof

Kuhn asked what falsification is, if not conclusive disproof. The answer is that falsification is a conditional disproof, conditional on the truth of the used test statements (and in some cases also on the truth of some used auxiliary hypotheses). Feyerabend’s example of the alleged falsification of the Copernican system with naked-eye observations shows this conditional character of falsifications quite well.

Does this cause any logical or methodological problems? The logical situation is quite clear and unproblematic. The methodological situation is only problematic for those who assume that there are infallible test statements. But as Kuhn said, Popper stresses that test statements are fallible. [56]
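The logical situation described in this passage can be sketched as a modus tollens. The symbols are assumptions introduced here for illustration and do not appear in the source: T is the theory under test, A the auxiliary hypotheses used, p a prediction, and b the accepted test statement.

```latex
\[
  (T \land A) \vdash p, \qquad b \vdash \lnot p
  \quad\Longrightarrow\quad
  b \vdash \lnot (T \land A),
  \qquad\text{and hence}\qquad
  (b \land A) \vdash \lnot T .
\]
```

On this reading the disproof of T is only as secure as b and A: if the test statement or an auxiliary hypothesis is later rejected, the falsification lapses, which is the sense in which it is conditional.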

The myth of naive falsificationism

Naive falsificationism is a myth created by positivist and conventionalist misunderstandings of Popper’s methodology. In the contemporary methodological discussion it is time to end the discussion of the straw man of naive falsificationism in its different positivist and conventionalist variants. It is time to come back to reality and to begin a discussion of real and critical falsificationism. [62]

The misuse of significance tests

The examples elaborated in the foregoing sections of numerical discrepancies arising from the rigid formulation of a rule, which at first acquaintance it seemed natural to apply to all tests of significance, constitute only one aspect of the deep-seated difference in point of view which arises when Tests of Significance are reinterpreted on the analogy of Acceptance Decisions. It is indeed not only numerically erroneous conclusions, serious as these are, that are to be feared from an uncritical acceptance of this analogy.

An important difference is that Decisions are final, while the state of opinion derived from a test of significance is provisional, and capable, not only of confirmation, but of revision. An acceptance procedure is devised for a whole class of cases. No particular thought is given to each case as it arises, nor is the tester’s capacity for learning exercised. A test of significance on the other hand is intended to aid the process of learning by observational experience. [100]