back to Chapter 4 | up to contents | on to Chapter 6

CHAPTER FIVE

First Order Logic or The Predicate Calculus

Predication

When we say something of something, it is called 'predication'. For example 'The door is blue', 'The brick is red'. Such statements are dissected by logicians into parts: the subject (that of which something is said) and the predicate (what is said of the subject). In the first example, the subject is 'the door', the predicate 'is blue'. Those of a mathematical turn of mind like to represent such statements by means of symbols; thus 'the door is blue' might become

B(d).

Here the predicate 'is blue' is represented by the letter B with its brackets ( ), while the subject, the door, is represented by the lower-case letter d which has submitted to the embrace of the predicatory brackets. We begin to see the possible of advantage of this symbolic approach if, for some reason, we want to represent the statement 'the window-frame is blue'. This would become

B(w)

- same predicate, different subject. It is similarly straightforward to apply a different predicate to the same subject, e.g. R(d) - the door is red. Now consider the formula

B(d) ∧ R(w)

(where ∧ stands for 'and'; read the formula as: 'B of d and R of w'). This obviously suggests the statement 'the door is blue and the window is red'. Considered as a whole, the statement says something of that collectivity that we call 'the door and the window'. It may be considered as a two-place predication P( , ) where the subject is the pair 'door, window' and where the (two-place) predicate is 'is blue as regards the first member of the pair, red as regards the second'. Thus

P(x,y)

says x is blue, y is red; and we may write

P(x,y) = B(x) ∧ R(y).

Why are the letters x and y not in bold type? That is because they are acting as 'place holders'. x is not representing an actual definite thing: our idea is, rather, that x could be anything you like. (There will be more on this as we go along.)

In fact, the above P(x,y) is a somewhat special example of a two-place predicate in that it decomposes into a predicate said of x and a predicate said of y. Suppose we wanted to say 'the window is to the left of the door', represented by

L(w,d).

This is not a case of saying something about w, something about d, and combining together the two 'things said'. It's saying something about w and d together. To put it slightly differently, we may say that L(x,y) expresses a certain relationship between x and y.

There is nothing to stop one having three-place predicates, and so on. For example,

G(x,y,z)

could be taken to mean 'x gave y to z'. This is certainly not a question of combining together statements about x, y, and z separately. To get the sense, we must hold the three subjects x, y and z in our mind simultaneously and perceive the fact that holds them all together: the fact that x gave y to z.

Truth and Falsity

G(a,b,c)

is an assertion about the definite, particular subjects a, b and c. (For example, a=John, b=the rose, c=Mary.) Suppose it was actually g=Gordon who gave the rose to Mary. In that case we say that the assertion is false. As this is a predication, we might be tempted to represent it by the formula

False(G(a,b,c)).

However, this would not capture a particular feature of the predicates True( ) and False( ) that will prove important, in the kind of logic that we are attempting to codify. Our general assumption will be that the facts of the matter can always be given in sufficient detail, that in the presence of the relevant facts, an assertion is either true or false. So our preferred mode of expression will be

Val(assertion) = T

when the assertion is true (in the light of the facts), and

Val(assertion) = F

when it is untrue. Here Val is short for Value, T for True, F for False. But missing even from this notation is the bit about 'in the light of the facts'. To evaluate a predication we need the facts, the background knowledge. We symbolize all this by the letter M, short for 'Model' - the reason for that choice of word will only emerge later. To represent the fact that M has a bearing (indeed, an all-important bearing) on the value of any assertion, we modify our notation to the following:

Val_M(assertion) = T or F.

If two assertions A and B have the same value in a certain context M then

Val_M(A) = Val_M(B);

for this, it is sometimes convenient to write instead

A =_M B.

If we agree (what should not be too difficult) that the value of T is always T in any context, and the value of F is always F, we may write instead of Val_M(A) = T, for 'A is true', the formula

A =_M T.

This would be read aloud: 'In the context M, the value of A is T'.

Negation

The operation of negation converts a predication into its contradictory opposite. Using the example of G(x,y,z) we may say that

¬G(x,y,z),

pronounced 'Not (G of x, y, z)' means 'it is not the case that x gave y to z'. Thus if it was really Gordon, not John, who gave the rose to Mary, we have

¬G(j, r, m) =_M T.

Connectives

We have already seen that a new predication can be obtained by combining together two simpler predications. The example given combined two predications with the connective 'and' symbolized by ∧. In any context M, if A and B are two assertions, the value of A∧B is T iff (if and only if) both A and B have value T; otherwise the value is F. We define three other connectives in a similar way:

or	A∨B	A∨B =_M T if either A =_M T or B =_M T, or both; the value is F otherwise
implies	A→B	A→B =_M F iff A =_M T and B =_M F, true otherwise
is equivalent to	A↔B	A↔B =_M (A→B) ∧ (B→A); the value is T iff A and B have the same value

General Statements

Here is where things get interesting. Suppose that T(x) says 'x is tall'. Then the formula

∀x T(x) ,

which reads 'for all x, T of x', represents the assertion 'every x is tall'.

But this is really a bit vague. What sort of entities are we talking about? Do we mean to say 'Everyone is tall'? Or do we mean: 'All the buildings are tall'?

For the formula to make sense, we need to know what we are talking about. (Never a bad thing.) More exactly, we need to specify our universe of discourse. In our first interpretation, this was 'people'; in the second, 'buildings' e.g. buildings in this town, buildings in America, etc.

The universe of discourse, U, is a set, whose elements are understood to be all the things that x could stand for when we assert (for example) ∀x T(x). We call this sort of statement a 'universal' statement.

Similar considerations apply to the other ('existential') mode of generalization. The formula

∃x T(x)

says, there exists an x, such that x is tall. Again, the same question arises: where are we allowed to look for our x that is tall? Are we to look among mammals, mountains, Masai warriors? Again, a universe of discourse is needed to tie the assertion down. It is an element of U that we must find, satisfying Val(T(x))=T, in order to verify the assertion ∃x T(x).

Is a general statement (of either type) a predication? If so, what is the subject? It would appear that such a statement is indeed a predication - with the universal set U as subject.

There is nothing to stop us combining general predications with ordinary ones. For example:

James is at home and all the flowers are blooming
H(j) ∧ (∀f B(f))

The universal set here includes, presumably, people (such as James) as well as flowers. But there is a problem. How can we make sure that ∀f refers only to the flowers in U, and not the people? We need a new predicate, F( ), meaning 'is a flower'. A better formulation of our combined statement would be

H(j) ∧ (∀f (F(f)→B(f)))

Formulae

Let us say that an atomic formula is an m-place predicate-function, symbolized by an upper-case letter, together with m terms within its brackets (the 'arguments' of the predicate-function), each term being either a constant (letter in bold type) or a variable (letter in ordinary type). For example:

G(x,y,z).

A formula (in general) is one of the following:

an atomic formula
(¬A), where A is a formula
(A∧B), where A and B are both formulae
(A∨B), where A and B are both formulae
(A→B), where A and B are both formulae
(A↔B), where A and B are both formulae
(∀x E), where E is a formula
(∃x E), where E is a formula

(Usually the variable x will occur somewhere in the formula E, but this is not mandatory.) A formula other than an atomic formula - in other words, a formula that is built up out of atomic formulae - we will term a compound formula. The brackets round each product of an operation (whether binary as in ∧, → or unary as in ¬, ∀x) serve to make the order of operations absolutely clear, when a compound formula is built up out of atomic formulae. They can be dispensed with, largely, if we agree on an order of precedence for the operations. The following convention seems quite natural, and we will adopt it:

¬ takes effect first
∧ and ∨ take effect next
∀x, ∃x take effect next
→, ↔ take effect last

Well-Formed Formulae (Wffs)

The scope of a quantifier, such as ∀x or ∃x, is defined to be the formula on which it acts. Let us say that a formula is well-formed if it contains no variable as an argument of an atomic formula that is not 'announced' by a quantifier within whose scope the atomic formula occurs. For example, if Q(..,t,..) occurs in the formula, it must be in one of two contexts:

∀t ( ... Q(..,t,..) ...)
∃t ( ... Q(..,t,..) ...) .

Suppose that ∀t E is well-formed. If t occurs in E, the formula on which ∀t acts, then E itself is not well-formed in that t occurs in it but is not announced by any quantifier. In this situation we say that t is 'free' in E and it is often convenient to indicate the fact by writing E(t) instead of plain E. Indeed, E(t) may be interpreted as a predication with the 'unknown' t as subject.

If ∀t E is well-formed, but t does not actually occur in E, then E itself is after all well-formed and we may write

∀t E = E ,

meaning that the two sides of the equation have equal values in all models. Clearly, if E is true, then E is true for all t, as t has no influence on E; and likewise if E is false. The same remarks apply to the existential quantifier ∃.

Models

A model M for a well-formed formula (or wff) W is what is required to determine the value of W (T or F). M will consist then of a set U (the universe of discourse), together with the 'facts' about U. Various atomic formulae will occur in W. By the 'facts' we mean the values (T or F) taken by all of these atomic formulae when any elements of U are substituted for the variables occurring in them. For example, suppose H(x,y) and G(s,t,u) occur in W. Then the model must tell us all the values taken by H(x,y) when all possible U-elements are substituted for x and y in turn; and also the values taken by G(s,t,u) when all possible U-elements are substituted for s and u in turn. The model must also give us a reference for the constant t i.e. it must tell us which U-element t refers to (and the same goes for any other constants occurring in W).

Lemma 5.1

In the presence of (or 'against the background of') a model M, any well-formed formula W takes on a definite value, T or F.

Proof

The number of distinct variables occurring in W is the same as the number of quantifiers (after we've eliminated any quantifiers that do nothing, i.e. that announce variables which then never appear as arguments of atomic formulae). Suppose this number is n and suppose further that the Lemma has already been proved for all wffs with fewer than n variables. We now consider two cases:

1) The outermost quantifier (or one of the outermost quantifiers) is ∀x, acting on E(x). Now, the value of ∀x E(x) is T if the value of E(x) is always T, whenever we substitute an element of U for x; otherwise, it is F. So to establish the value of ∀x E(x) we need to look at all the values E(x), where x refers to each of the U-elements in turn. (E(x) is the formula that results from replacing all occurrences of x in E(x) by x.) Now E(x) is not well-formed (because it contains a free variable), but for each choice of x, E(x) is well-formed, and contains fewer than n variables; therefore, by hypothesis, it has a value in the presence of M. If the value is T, for all choices of x, then the value of the quantified formula is T; if, on the contrary, we ever come across an x that makes the value of E(x), F, then the value of the quantified formula is F. Since W is either the quantified formula, or the combination (by means of ∧, ∨ etc.) of that formula, or its negation, with other wffs, containing fewer than n variables, we may deduce that M determines the value of W.

2) The outermost quantifier is ∃y, acting on F(y). Again we consider the various values F(y), where y stands for all the U-elements in turn. If any of the values is T, then the value of the quantified formula is T; if all F, the value of the quantified formula is F. Again, it follows that W itself must have a value determined by M.

Finally, it is clear that the Lemma is true when the number of variables is zero: this is just a matter of combining values of atomic formulae (with no variable arguments, just constant ones) using ¬, ∧, etc. Therefore, by mathematical induction, the Lemma is true for all n◊

Implication

Implicated means 'folded in'. It is a strange and wonderful thing that one assertion may be 'folded in' to another, and discovered by a process of unfolding or evolution. This is the central mystery of Logic, the nugget of gold that is the object of our labours.

Consider, for example, the three wffs

∀x (R(x)→H(x))
R(t)
∃y H(y)

which could be interpreted as

All rich people are happy people
Terry is rich
There is someone who is happy.

You may well be able to intuit (i.e. see directly) that the third statement is implicated in the first two. But how does this really work? You don't have to know very much about happiness, or wealth, or Terry, to see that the third statement follows from the other two. Turning back to the three wffs, is there something in their very structure that makes the third implicated in the first two? Let's try another interpretation:

All rats are hungry
Tom is a rat
∴ There exists a hungry animal.

This seems to work equally well. Could we hazard that any interpretation of the wffs would verify the implication? And what would 'verify' actually mean?

It is easier to think about what an interpretation would have to do to disprove the implication. This would be an interpretation in which the first two statements were true but the third was false. Nothing else (e.g. just one of the first two statements true, the third false) will do. Putting it symbolically, we would have to find a model in which

(∀x (R(x)→H(x))) ∧ R(t) ∧ ¬(∃y H(y)) =_M T.

A model that did this would disprove the implication. Turning everything round, the implication will be verified if we can show that all possible models lead on the contrary to

(∀x (R(x)→H(x))) ∧ R(t) ∧ ¬(∃y H(y)) =_M F .

This has the form

P₁∧P₂∧¬C =_M F ,

where P₁ and P₂ stand for the first two statements (the premises), C for the third (the conclusion). Think what this says: it is never the case that the premises are true without the conclusion being true as well. If we know that the first two statements are true in some model, we can then deduce that the third one must be true as well, without actually evaluating it.

To put the verification-criterion another way:

¬(P₁∧P₂∧¬C) =_M T,

equivalently,

(P₁∧P₂)→C =_M T

for all M. In general, we may write this

(P₁∧P₂∧ ... ∧P_m) → C = T,

where the various P_i stand for the premises (m of them), and it is understood that the equals sign = without a qualifying subscript M means simply 'equivalent (i.e. having equal values) in all models'.

All this gives us no more than the definition of a valid implication, and gives no guidance as to how we might prove that the particular wff (P₁∧P₂∧ ... ∧P_m)→C happens to be true in all models. We will take up the question of proof in the next and subsequent chapters. For now we merely note that proving that a wff is true in all models is no trivial task. If the wff in question is at all complicated, the number of possible models is tremendously large, i.e. infinite several times over. To make any claim about all the possible models for a wff is to stick ones neck out, to say the least of it.

Universal Validity, Contingency and Unsatisfiability

If a wff W is true in all possible models, we say that W is universally valid. If there is no model that makes W true, we say that W is unsatisfiable. An alternative definition of unsatisfiablity: ¬W is universally valid. If neither of these is the case, in other words, if W is sometimes true, sometimes false, depending on the model, then we say W is contingent. This is, of course, by far the commonest case.

back to Chapter 4 | up to contents | on to Chapter 6