The Validity of Inference
Suppose we are given three facts about the world:
Are we entitled to infer from what we are told that
It is quite easy to see, using 'natural language', that we are indeed entitled to make this inference. For cats with four legs are elegant and therefore do not like bones, and therefore are not dogs. The other cats are not dogs either, simply because they lack four legs.
Our aim in this section is to see that we may settle the question of the validity of an inference, if we choose, not by thinking, but by 'calculating' with circles and letters.
Note that the given facts or premises may be expressed as statements about sets of animals. Thus if we let
the premises are
and the conclusion is
(Here we have used some standard notation of set-theory: ⊂ means 'is a subset of', ∩ means 'intersected with', ∼ means 'the complement of', ∅ means 'the empty set', and ∪ (not used so far) stands for 'united with'.) How may we translate all this into the algebra of the mark? Let's build it up in easy stages. Recall that the subset relation (e.g. 'a is a subset of b') may be represented in algebra by the following:
a(b) = O ,
(a(b)) = _ .
The first equation may be read as: the intersection of a with not-b is the empty set. The second, as: the union of not-a with b is U.
Let F(a,b,c...) be an expression involving the letters a,b,c, etc.. Then the equation
F(a,b,c... ) = _
may be interpreted as a statement expressing a constraint on the sets a, b, ... In the example above, the constraint was simply 'a is a subset of b'. If, on the other hand, F was the expression (ab), then the corresponding statement, or constraint, would be: 'the intersection of a with b is empty', or, 'a and b are disjoint sets'.
Inference is a relation between statements. For example, if S1 and S2 are the statements
then from the two statements together we may infer a third, namely
This is clear when we think concretely about sets (possibly using a Venn diagram, for example). How do we make the inference 'algebraically'? The statement S3 obviously corresponds to the equation
We seek to deduce E3 from
To see how this may be done, we need a couple of lemmas.
Lemma A: (a(bc)) = (a(b)) (a(c))
Proof (using distribute): (a(bc)) = (a( ((b)) ((c)) ) = (( (a(b)) (a(c)) )) = (a(b)) (a(c)). Q.E.D.
Lemma B: (a(b))(b(c)) = (a(b)) (a(c)) (b(c)).
Proof (using copy,
uncopy and Lemma A):
Now, given the general truth expressed by Lemma B
(a(b))(b(c)) = (a(b)) (a(c)) (b(c)) ,
(which could be translated into a set-theoretical statement that was true of all sets a, b and c) and given also that in a particular application of the algebra to a set-problem, the sets denoted by a, b and c are such that (a(b)) = _ and (b(c)) = _ , by substituting these values into the equation we obtain
_ = (a(c)) .
Thus if E1 and E2 hold in a particular application, so does E3.
It turns out that there is a very simple test for whether a set of premises, expressed as
does or does not entail a conclusion
The test is this. Form the expression
Z = FG...K (L) .
If Z reduces to an empty mark O then the inference is valid; otherwise, not. (In other words, the inference is valid if and only if the expression FG...K 'dominates' L, using the word in the sense introduced in the chapter on the Mutual Dominance Lemma.)
The first part of this statement is easy to prove. Given that Z = O and that F, G... up to K are all equivalent to _ , we may deduce
(L) = O
L = _ .
For the second part, we need to think a bit harder about what we mean by a 'valid inference'. I suggest the following interpretation. Let U be any set. There are many ways of assigning the elements of U to the subsets represented by the letters a, b, c and so on. Let's refer to each way of choosing U and the subsets a,b,c etc., as a 'data-set'. Given a data-set, the premise
F(a,b...) = _
may or may not be satisfied. (For example, in case F(a,b)=(a(b)), a may or may not be a subset of b). Now, I say that our inference is valid if and only if whenever a data-set satisfies all the premises, it also satisfies the conclusion. Or, to rephrase the condition, there is no data-set such that the premises are satisfied but not the conclusion.
Now, suppose that the expression Z (defined above) is not equivalent to the empty mark, but to some other expression involving the letters a, b, c... (we must include the case when Z reduces simply to _ ). Think of the letters, for a moment, as variables taking the value _ or O. For each choice of values, Z becomes an arithmetic expression which reduces to _ or O. If Z always reduced to O, whatever values were substituted for the letters, we would have to conclude (thanks to the Test Theorem) that Z was after all equivalent to O. Therefore, by hypothesis, there must be a choice of values such that Z reduces to _ .
Next, we use this choice of values to determine a special data-set. Let U be any non-empty set. If, to make Z = _ , we substituted _ for a, then let the subset a be the universe U; if we substituted O, then let a be the empty set. Similarly for b, c and all the other letters. If some letter is not involved in Z, then we may arbitrarily choose whether to equate the corresponding subset with U or the empty set. (We note that if Z was actually equivalent to _ then this applies to all the letters.)
Now, we claim that this special data-set is one that satisfies all the premises but not the conclusion. Recall the definition of Z:
Z = FG...K (L) .
When the particular set of values _ or O is substituted for the letters, Z reduces to _ , which implies that each of F, G ... must reduce to _ and L must reduce to O. But when the data-set is the special one described above, we are entitled to make just this substitution. Thus we have a counterexample to the inference: a data-set which satisfies all the premises, but not the conclusion.
The time has come to apply our discoveries to our original problem, concerning the cats and dogs. We may translate the premises, and the conclusion, into algebraic form as follows:
We form the expression Z:
Z = (d(bf)) (eb) (fc(e)) cd .
Now we apply our usual algebraic techniques. Uncopying c and d (and, subsequently, b, f and e) we get
Z = ((bf)) (eb) (f(e)) c d = b f (eb) (f(e)) c d = b f (e) ((e)) c d = b f O e c d = O.
Our feeling that the inference was justified is confirmed! Cats and dogs are indeed distinct.
Suppose we had the idea that from the premises we could infer that cats do not like bones; so our conclusion was
instead of C1. Then the expression Z would become
Z = (d(bf)) (eb) (fc(e)) c b = (d(f)) (e) (f(e)) b c = (d) (e) (f) b c,
which does not equate with O. Suppose our universe consists entirely of bone-loving, 3-legged cats which are not dogs and are not elegant. Then b and c both equate with U, while d, e and f are empty; and the expression Z reduces to _ . It can easily be checked that the premises are satisfied; while the conclusion, obviously, is not. The inference is therefore invalid.
Not only has the algebraic method shown that the inference is invalid; it has even provided a counterexample. It is almost as if the circles and letters can think - or even teach us how to think! This seems as good a place as any to break off our account of the remarkable markable mark; while recognizing that there are many directions in which the account could be extended.
Appendix to this chapter
1) Suppose one of the premises is impossible to satisfy; it might be, for example
a(a) = _ ,
or else an equation with some more complicated avatar of O on the left-hand side. Then there is no data-set which can satisfy the premises. Let L = _ be some arbitrarily concocted conclusion; then there is no data-set that can satisfy the premises, but not the conclusion. Therefore, the inference from the premises to the conclusion is valid! From one piece of nonsense, one may infer any number of other pieces of nonsense.
2) The following considerations shed some new light on the inference process. We start with another lemma.
Lemma: (a(b)) = _ if and only if there exists an expression c such that a is equivalent to bc.
Proof. Suppose (a(b)) = _ ; then a = a (a(b)) = a ((b)) = ab. Let c = a; then a = bc. We are halfway there. Suppose conversely that a = bc; then (a(b)) = (bc(b)) = (bc( )) = _ . QED.
We may now apply the lemma to our test of validity of an inference, which is
FG..K (L) = O,
(FG...K(L)) = _ .
Equivalently again (using the lemma), there exists an M such that
FG...K = LM.
If we call FG...K the 'premise expression', this says that the premise expression may be transformed into another expression, one of whose factors is the 'conclusion expression' — in this case, L. In other words, by a process of algebraic transformation, the conclusion expression may be 'brought out of' — or 'deduced from' — the premise expression.
A simple but fundamental example of this process is furnished by Lemma B (above). The lemma tells us that the premise expression
may be transformed into
(a(b)) (b(c)) (a(c)).
The conclusion expression (a(c)) has magically 'emerged' from the premise expression.
We can do the same for the cat-and-dog problem; this boils down to two applications of Lemma B. The premise expression is
(d(bf)) (eb) (cf(e)) .
The last two factors give birth to (cfb) as an extra factor (by Lemma B); then the pair
gives birth to (dc). Thus the premise expression has been transformed into
(d(bf)) (eb) (cf(e)) (bfc) (cd)
and we have the conclusion (no cats are dogs) that we were after.