Suppose we want to express a statement like
“there is a location which has two neighbors”
(which is true, at least for the domain of WaterWorld board
locations), or
“all actors have co-starred with
Kevin Bacon”
(which isn't true, at least for the domain of all Hollywood actors).
As it stands, we can only formulate these in an awkward way —
by talking about
specific (constant) locations like AA and GG,
or specific actors like
Ewan McGregor and
Cameron Diaz.
To talk about all locations, or actors, we're forced to
make huge formulas such as
nhbr(Z,Y)∧¬nhbr(Z,A)∧¬nhbr(Z,B)∧…∧¬nhbr(Z,X)
nhbr(Z,Y)
nhbr(Z,A)
nhbr(Z,B)
…
nhbr(Z,X)
,
just to express
“there is a location which has only one neighbor”.
We'll redress this by introducing two quantifiers,
“∃∃” (“there exists”) and
“∀∀” (“for all”).
For example, “all actors have co-starred with Kevin Bacon”
will be written
∀a.
coStarredWith(a,Kevin Bacon)∀a.coStarredWith(a,Kevin Bacon)
.
For “there is a location which has (at least) two neighbors”,
we'll start with “there exists a location xx …”,
written “∃x.∃x.…”.
“For all” is really just an abbreviation for a
large conjunction, while “exists” is a disjunction
(it could also be called “for some”,
though it's not).
How large a conjunction/disjunction?
As big as your domain — which actually could be very
small, or it could be infinitely large.
Even aside from the fact that we can't write down an infinitely large
conjunction or disjunction,
quantifiers let us form the conjunction without having to
select a domain in advance.
To continue with our WaterWorld example,
how can we express the concept "xx has (at least) two neighbors"?
Well, we'll rephrase this as,
“there exist distinct locations, yy and zz, which each of which
is a neighbor of xx”,
written
∃x.
∃y.
∃z.
y≠z∧nhbr(x,y)∧nhbr(x,z)∃x.∃y.∃z.
y z
nhbr(x,y)
nhbr(x,z)
.
We need the condition
¬y=z y z
in that formula to ensure that we have distinct locations.
Compare to the algebraic equation
x+y=4
x
y
4
in which one possible solution is
x=y=2
x
y
2
.
Variables act the same way in both logic and algebra:
different variables can happen to take on the same value.
We use quantifiers all the time in natural language.
Consider the following examples, where we provide a natural
English wording together with an equivalent phrasing that
makes the quantification more explicit.
We'll take the translations with a grain of salt, since
sometimes people can disagree on the exact details of
the intended English meaning.
Such ambiguity can sometimes be a rich source of creativity,
but it's not tolerable when documenting safety properties of software.
While some of these examples are a bit frivolous,
in general quantifiers let us precisely capture
more interesting concepts in type-checking,
data structures such as trees and hash tables,
circuit specifications, etc.
Quantification in English
| Natural English |
Formalized English |
| “If you don't love yourself, you can't love anybody else.”
|
“If you don't love you, there does not exists a person yy,
such that you love yy.”
|
| “N*Sync is the best band ever!”
|
“For all bands xx, N*Sync is better than band xx
(or, x=N*Syncx N*Sync).”
A quick listen can easily show this statement false.
|
| A casually subtle line from Something About Mary:
“Every day is better than the next.”
|
“For all days xx, xx is better than next(xx).”
|
| A buggy line from a song (Everybody Loves My Baby,
Jack Palmer & Spencer Willson, 1924):
“Everybody loves my baby; My baby don't love [anybody] but me.”
|
“For all persons xx, xx loves my baby.
For all persons yy, if my baby loves yy, then yy is me.”
If true, one can conclude
the speaker is his own baby, and is narcissistic.
|
| “Every neighbor of x is unsafe.”
|
“For all locations yy,
if yy is a neighbor of xx, then yy is unsafe.”
|
| “There is a safe location that is a neighbor of x,
if num(x)<3.”
|
“If num(x)<3,
then there exists a location y,
such that y is safe, and y is a neighbor of x.”
|
| “If you've seen one episode, you've seen 'em all.” |
“If there exists one episode xx such that you've seen xx,
then for all episodes zz, you've seen zz.”
|
| “Somebody loves everybody.”
|
“There exists some person yy, such that
for all persons xx, yy loves xx.”
|
| “There is someone for everybody.”
|
“For all persons xx, there exists a person yy,
such that yy is for xx.”
|
|
“All's well that ends well.”
|
“For all events xx, if xx ends well then xx is well.”
|
The ambiguous “any”:
I was playing a game with some friends, and we came across the rule:
“If you have more cards than any other player,
then discard a card.”
Does this mean
“than all other players”, or
“than some other player”?
Our group's opinion was divided (incl. across many native English speakers).
In our class terms, it's not always clear
whether “any” means for-all,
or for-some (there-exists).
Or maybe more accurately,
in the phrase “for any xx”,
does xx necessarily mean an
arbitrary
player?
Linguistics students, or those who are so sure the rule clearly
intended “than all other players”:
Switching
“
(x>y)
( x y)”
to
“
(x<y)
( x y)”
changes from an active voice to
a passive voice
but may also reverse your interpretation of the English
quantifier “any”:
“If any player has fewer points than you, …”
In your proof-writing (and your English writing, and your informal writing),
think about replacing “any” with
either “every” or
with “some”, to make
your meaning clear.
We originally defined a well-formed formula (WFF) for
propositional logic; we'll extend this to WFFs for
first-order logic, also known as
predicate logic.
At the same time, we'll more precisely define the binding of variables.
This logic allows use of both functions and relations.
Since these functions' outputs are not Booleans (otherwise, we'd call
them relations), but rather data than can be used as a relation's input,
we separate the syntax into that of terms and formulas.
Terms are all the possible inputs for a relation.
- Definition 1:
term
1.
A variable.
2.
A constant.
WaterWorld location FF, Kevin Bacon, or the number 3.
3.
A function applied to one or more terms.
successor(3)successor(3)
- Definition 2:
Well-Formed Formula (WFF) for first-order logic
1.
A constant: true or false.
2.
An atomic formula: a relation symbol applied to one
or more terms.
3.
A negation of a WFF,
¬φ φ .
4.
A conjunction of WFFs,
φ∧ψ φ ψ .
5.
A disjunction of WFFs,
φ∨ψ φ ψ .
6.
An implication of WFFs,
φ⇒ψ φ ψ .
7.
A universal quantification of a WFF,
∀x. φ∀x.φ .
∀x. nhbr(x,F)∀x.nhbr(x,F)
8.
An existential quantification of a WFF,
∃x. φ∃x.φ .
∃x. nhbr(x,F)∃x.nhbr(x,F)
While a formula is just a piece of syntax,
the meaning of its connectives, including the quantifiers,
is part of the definition of a WFF.
However, as previously discussed,
the meaning of a WFF also depends on the
interpretation
we give to its relations.
Everybody likes John Cusack:
∀x.
likes(x,John Cusack)∀x.likes(x,John Cusack)
.
Somebody likes Joan Cusack:
∃x.
likes(x,Joan Cusack)∃x.likes(x,Joan Cusack)
.
Somebody likes everybody:
∃x.
∀y.
likes(x,y)∃x.∀y.likes(x,y)
.
(We use nn for “needy”?)
Everybody likes somebody:
∀y.
∃x.
likes(y,x)∀y.∃x.likes(y,x)
.
Careful; this formula looks similar to the preceding one, but it has
a very different meaning!
How would you express
“Somebody is liked by everybody”?
The cue
“Somebody …”
suggests one person who exists; we'll call them pp for
“popular”:
∃p.∃p.….
Now we need to fill in the dots with
“everybody likes pp”, to get:
∃p.∀x.
likes(x,p)∃p.∀x.likes(x,p)
.
How would you express
“Everybody is liked by somebody”?
The cue
“Everybody …”
suggests a universal; we'll call them jj for
“J. Doe”:
∀j.∀j.….
Now we need to fill in the dots with
“somebody likes jj”, to get:
∀j.
∃x.
likes(x,j)∀j.∃x.likes(x,j)
.
Note that this formula is just like the preceding
“Somebody likes everybody” example,
except that the quantifiers have been swapped
(and different variable names were used, a superficial difference).
The following formula is a simple application of symmetry.
∀∧x∧.∧∀∧y∧.∧near(x,y)⇒near(y,x)∧near(Sue,Joe)⇒near(Joe,Sue)
∀x.
∀y.
near(x,y)
near(y,x)
near(Sue,Joe)
near(Joe,Sue)
.
While it is certainly true under the intended interpretation,
it is also true under any interpretation.
Such formulas are called valid.
Valid first-order formulas are the natural analog of tautological
propositional formulas.
∀x.
even(x)∧prime(x)⇒x=2∀x.
even(x)
prime(x)
x 2
is a mathematical fact, in the standard interpretation of arithmetic.
While technically not allowed by our
term and
formula
syntax, we'll continue using infix notation
for common mathematical functions and relations, as in
the previous example.
The
previous example used the relations
eveneven and primeprime.
Of course, to use such relations,
they must either be defined directly by the interpretation, or
be defined in terms of functions and relations provided by
the interpretation.
How would you define these two relations in terms of the basic
numerical functions (addition, multiplication, …)
and relations (== , <, >)?
“Evenness” is a straightforward translation from
“An integer n is even, iff it is twice some other integer k”:
even(n)≡∃≡k≡.≡n=2k
even(n)
∃k.
n
2k
.
Note that by this standard definition, zero is even.
There are many equivalent ways to define primality, just as there
many algorithms for checking primality.
One straightforward solution is
noncomposite(n)≡∀≡j≡.≡∀≡k≡.≡jk=n⇒j=1∨k=1
noncomposite(n)
∀j.
∀k.
j k
n
j 1
k 1
.
Well, this is almost expresses “prime”,
except that
n=1
n
1
satisfies this formula.
A mathematician points out that just as 0 is neither positive nor
negative, 1 is neither prime nor composite;
as stated this formula
actually captures “noncomposite”, oops.
There are several ways to upgrade this to exactly capture
“prime”.
1 is called a “unit”.
If we consider the domain of all integers (not just natural numbers),
the idea of primality still makes sense; -17 is also prime;
and -1 is also another unit. Similarly, considering the domain
of “complex integers”
{a+bⅈ|a∈ℤ∧b∈ℤ}
a
b
a
b
(could be written
“
ℤ+ℤⅈ
”),
then
ⅈ
and
-ⅈ
are also units.
How might we generalize our definition of prime, to work
in these further interpretations?
A similar, equivalent formula to the above is
noncomposite(n)≡¬∃
noncomposite(n)
∃j.
∃k.
j k
n
j 1
k 1
.
One hypothesis about natural numbers is known as
Goldbach's Conjecture.
It states that all even integers greater than two
can be expressed as the sum of two primes.
It is one of the oldest still-unsolved problems about numbers.
How would you write this conjecture as a WFF?
∀n.
even(n)∧n>2⇒∃∀n.
even(n)
n 2
∃p.
∃q.
prime(p)
prime(q)
p q
n
Enough about number theory. Let's look at some examples
about common data structures and some about our favorite problem,
WaterWorld.
If your program uses binary search trees and your domain is
tree nodes, you need to know
∀node.
dataleftnode≤datanode∧datarightnode>datanode∀node.
data
left
node
data
node
data
right
node
data
node
.
If these trees are also balanced, you need to know
∀node.
heightleftnode=heightrightnode∨heightleftnode+1=heightrightnode∨heightleftnode=heightrightnode+1∀node.
height
left
node
height
right
node
height
left
node
1
height
right
node
height
left
node
height
right
node
1
.
Again, these assume the implied interpretations.
We would like to be able to state that the output of a sorting
routine is, in fact, sorted.
Let's assume we're sorting arrays into ascending order.
To talk about the elements of an array in a typical programming
language, we would write something like
AiAi.
For this example, we'll use that notation, even though it doesn't
quite fit the logic's syntax.
To describe sortedness (in non-decreasing order),
we want to state that each element is greater than or equal to
the previous one. However, just like in a program, we need to
ensure our formula doesn't index outside the bounds of the array.
For this example, we'll assume that an array's indices are
zero to (but not including)
sizeAsizeA.
sorted(A)≡∀≡i≡.≡1≤i∧i<sizeA⇒
A
i-1
<
A
i
sorted(A)
∀i.
1
i
i
size
A
A
i1
A
i
When proving things about programs, it's often useful to realize
there are alternate ways of defining things. So, let's see a couple
more definitions.
We could change our indexing slightly:
sorted(A)≡∀≡i≡.≡0≤i∧i<sizeA-1⇒
A
i
<
A
i+1
sorted(A)
∀i.
0 i
i
size
A
1
A
i
A
i
1
.
Or we could state that the ordering holds on every pair of elements:
sorted(A)≡∀≡i≡.≡∀≡j≡.≡0≤i∧i<sizeA∧0≤j∧j<sizeA∧i<j⇒
A
i
≤
A
j
sorted(A)
∀i.
∀j.
0 i
i
size A
0 j
j
size A
i j
A
i
A
j
.
This definition isn't any stronger, but it makes an additional
property explicit. Generally, you'd find it harder to prove that
this formula was true, but once you did, you'd find it easier to
use this formula to prove other statements.
The two preceding examples used functions like leftleft,
sizesize, and subtraction, although our logic syntax
doesn't include such functions.
However, we can rewrite any use of functions with appropriate new
relations.
As an example, rewrite
i<sizeA
i
size A
in proper first-order syntax.
We need a new relation that combines the syntax of
< and sizesize.
The result would look like
less-than-size(i,A)less-than-size(i,A).
This assumes the new relation has the obvious intended definition.
One simple WaterWorld fact is that if a location has no unsafe
neighbors, then its number of adjacent pirates is zero.
Furthermore, the implication goes both ways.
How would you state that as a WFF?
∀x.
(∀y.
nhbr(x,y)⇒safe(y)
⇔has0x)∀x.(∀y.
nhbr(x,y)
safe(y)
⇔has0 x)
How would you make a similar statement about the number of
adjacent pirates being one?
There are various solutions, but they all must capture the same idea:
there exists exactly one unsafe neighbor.
This solution states that in two parts:
-
There exists an unsafe neighbor, uu.
-
Every unsafe neighbor is uu.
Together, these two parts imply there is only one such
uu.
∀x.
(∃u.
nhbrxu∧¬safe(u)∧∀∧y∧.∧nhbr(x,y)⇒(¬safe(y)⇔y=u)
⇔has1x)∀x.(∃u.
nhbr x u
safe(u)
∀y.
nhbr(x,y)
(
safe(y)
⇔ y u )
⇔has1 x)
These statements are very similar to, and provable from, the
first-order WaterWorld domain axioms.
Some formulas can get pretty hairy:
∀x.
∃y.
∀z.
likes(x,y)∧¬likes(y,z)∀x.∃y.∀z.
likes(x,y)
likes(y,z)
.
The zeroth step is to take a breath, and read this in English —
for every xx, there's some yy such that for every zz,
xx likes yy but yy doesn't like zz.
Even so, how do we approach getting a handle on what this means?
Given an interpretation, how do we know it's true?
The top-down way would be to read this formula left-to-right.
Is the whole formula true?
Well, it's only true if, for every possible value of x,
some smaller formula is true
(namely,
“there exists a yy such that
forall zz, likes(x,y)likes(x,y)
and ¬likes(y,z)likes(y,z).”).
(This is a formula with xx free —
that is, it's a statement about xx.)
And is that formula true?
Well, precisely when we can find some yy such that … (and so on).
This direct approach is hard to keep inside your head all at once.
Most people prefer approaching the problems in a bottom-up manner
(or if you prefer, right-to-left or inside-out):
First consider at the small inner bits alone, figure out what they mean,
and only then figure out how they relate.
-
What does the innermost formula
“likes(x,y)∧¬likes(y,z)
likes(x,y)
likes(y,z)
”
mean, in English? That's not so bad:
xx likes yy, and yy dislikes zz.
A statement about three people called xx, yy, zz.
-
Working outward, what does
“∀z.
likes(x,y)∧¬likes(y,z)∀z.
likes(x,y)
likes(y,z)
”
mean? Ah, not so bad either:
xx likes yy, and yy dislikes everybody.
-
Keep on going:
“∃y.
∀z.
likes(x,y)∧¬likes(y,z)∃y.∀z.
likes(x,y)
likes(y,z)
”
becomes “xx likes some misanthrope”.
-
Now it's clear:
“∀x.
∃y.
∀z.
likes(x,y)∧¬likes(y,z)∀x.∃y.∀z.
likes(x,y)
likes(y,z)
”
is just “everybody likes some misanthrope”.
Phew!
We have already seen quite a few formulas of the general form
∀x.
P(x)⇒…∀x.
P(x)
…
.
Indeed, this is a very useful idiom:
If our domain is natural numbers but we want to say something
about all primes, we simply write
∀n.
prime(n)⇒…∀n.
prime(n)
…
.
Don't be fooled;
this formula is in no way suggesting that all numbers are prime!
This same construct using ∃∃ is usually a mistake. Consider
∃x.
P(x)⇒…∃x.
P(x)
…
.
By choosing xx to be any non-PP element,
this entire formula is true, without even glancing at what is inside
the “…”!
If you have to demonstrate that all ravens are black —
∀i.
isRaven(i)⇒isBlack(i)∀i.
isRaven(i)
isBlack(i)
— there are two ways to do so:
You can go out and find every raven and verify that it's black.
Alternately, you can go and find every non-black item, and verify
that it's a non-raven.
Epistemologists — philosophers dealing with how we humans
come to learn and know things (about, say, raven colors) —
go on to ponder about real-world degrees-of-belief:
If we have only looked at some ravens, and we find another raven and
confirm it is black, does this increase our degree of belief about
all ravens being black?
If so, then whenever we find a non-black item which is a non-raven,
this must also increase our degree of belief that all ravens
are black.
This leads to Hempel's (so-called) Paradox:
if we are looking for evidence to choose between two competing hypotheses
(say, “all non-black items are not ravens”
versus “all non-orange items are not ravens”),
then finding a purple cow increases our belief in both
of these hypotheses, simultaneously!