Notes (LCBB 02/07): Conversion
2 Conversion
2.1 Definition of Lambda Terms
A lambda terms is either
A variable from the infinite set of variables
V = {v, v', v'', ...}
An application
(M N)
whereM
andN
are both lambda termsAn abstraction
(\x.M)
, wherex
is a variable andM
is a lambda term
Equivalent definition with de Bruijn indices:
A lambda term is a natural number {0, 1, 2, 3, ...}
or any pair of lambda terms.
- A variable is a natural number.
- An abstraction is any pair
(0 M)
, whereM
is a term. - All other terms are applications
2.2 Examples of Lambda Terms
Equivalent formulations:
v' = 2
v'v = 2 1
\ v ( v' v) = 0 (2 1)
((\ v (v' v)) v'') = (0 (2 1)) 3
(((\ v (\ v' (v' v))) v'') v''') = (0 ((0 (1 2)) 3)) 4
Notice that we can remove the heads (decapitate) all abstractions by using the variable number to indicate how may layers of abstraction the variable is bound in. In the last line 0 (0 (1 2))
is equivalent to \ x. \y. (y x)
.
De Bruijn indices are fun, but probably out of band for these notes. I plan on doing more with them later though.
2.3 Convention of Notation
Lower case letters are variables. Upper case letters are terms
M == N
denotes thatM
andN
are alpha equivalent. Two expressions are alpha equivalent if their Partial de Bruijn forms are identical. An expression may be converted into its Partial de Bruijn form by replacing every bound variable with the relative index of its binding abstraction layer, every\
followed by a variable with0
and leaving all free variables constant.(\ x y) z == (\ x y) z => (0 y) z (\ x x) z == (\ y y) z => (0 1) z (\ x x) z => (0 1) z \= z (\ x x) z => (0 1) z \= (\ x y) z => (0 y) z
Hmm, this seems awkward, but I find the definition in the book to be a little hand-wavy. Alpha equivalence can get really complicated when we start to think about name shadowing etc. I really want to just say that expressions are alpha equivalent if their de Bruijn index forms are equivalent, but I can’t because even though free variables are free, they should be consistent in equivalent expressions.
Applications are left-associative:
F M1 M2 == (F M) M2
In an abstraction
.
separates the head variable from the expression body\ x1. (\ x2. M) == \ x1 (\ x2 M)
Abstractions are right-associative, and
\ x1. \ x2. M == \ x1. (\ x2. M) == \ x1 (\ x2 M)
2.4 Definition of Free Variables
The set of free variables of an expression
M
, notatedfreeVariables M
are all indices in an expressions de Bruijn form which point to a relative abstraction layer outside ofM.
Equivalently, they are the variables that remain after convertingM
into it’s Partial de Bruijn form.M
is a closed term or combinator if it has no free variables.
2.7 Definition of Lambda Calculus
The principle rewrite rule is beta reduction (defined above)
There are also logical rules:
Equality:
- All terms are equal to themselves
- Equality is associative
- Equality is transitive
Compatibility rules:
- Applications of equal terms to other equal terms are equal
- Abstractions of the same variable over equal terms are equal (eta conversion)
If a statement
S
is provable by these rules, we write\ |- S
2.8 Remark on Alpha Conversion
I’m glad that the authors agree with me here that de Bruijn is the way to go. I personally think its easier to learn de Bruijn indices first. Alpha conversion is a lot more complicated than it seems the first time you learn it. But de Bruijn hides no complexity, at the expense of being very tedious to reduce by hand. But why are we reducing terms by hand? We live in the age of the computer!
2.10 SKI Combinators
Let
I = \ x. x
K = \ x y. x
K* = \ x y. y
S = \ x y z. x z (y z)
I
is the identity function, K
returns the first of two arguments, K*
the second of two arguments.
S
is trickier. It takes three arguments, and applies the application of the first to the third to the application of the second to the third. Think of it as amalgamating all three values.
Suppose we had the term \ x. X F
and we wanted to apply F
to X
. Then we could do:
S (\ x. K* x) (\ x. K x) (\ x. x X F) =
(\ x. K* x) (\ x. x X F) ( (\ x. K) (\ x. x X F)) =
(K* X F) (K X F) =
= F X
2.12 Fixed-Point Combinator
Suppose f
is a function. If there is an input x
to f
such that f(x) = x
, then x
is said to be a fixed-point of f
.
And let’s also suppose we have a function fix
that returns a fixed point x
of a function, so that fix(f) = x
.
Now we can do something neat! Let’s look at the two above equations together:
f(x) = x
fix(f) = x
Because a fixed point x
is the result of applying f
to x
and applying fix
to f
:
fix(f) = f(x) = x
And we can expand this further by substituting fix(f)
for x
in f(x)
:
fix(f) = f(x) = f(fix(f)) = f(f(x))
Since we can repeat the substitution as many times as we want, this implies:
f(x) = f(f(x)) = f(f(f(x))) = ...
Which makes total sense, since every application of f
in the above is just transforming x
to x
. So our fix
function is just recursively applying f
.
But can we build a lambda expression for fix
? We can!
Y = \ f. (\ x. f (x x)) (\ x. f (x x))
Y F = (\ x. F (x x)) (\ x. F (x x)) = F ((\ x. F (x x)) (\ x. F (x x)))
= F Y F
For a more intuitive explanation on how we might have invented the Y
combinator ourselves take a look at my notes on Rojas’ Tutorial Introduction to the Lambda Calculus
2.13 Example
- Simply:
Y F = F Y F => Y S = S Y S
G = Y S => G X = (Y S) X = S Y S X = S G X
Which makes sense, because by eta conversion (\ g x. S g x) = S
- If
G X = G G
thenG = \ x. G G
. This seems connected to ( g x . g g) in some way, but I’m not sure how to reason from one to the other…
Well, if using an expression that generates Y
doesn’t count as using Y
then:
G = (\ f. (\ x . x x) (\ x. f x x)) (\ g x. g g)
= (\f . (\ x . f x x) (\ x. f x x)) (\ g x. g g)
= Y (\ g x. g g)
= (\ g x. g g) (Y (\ g x. g g))
= \ x. (Y (\ g x. g g)) (Y (\ g x. g g))
= \ x. G G
But what if we can’t even use an expression that generates Y
?
Then we can just any another fixed point combinator. Let’s try the Z
combinator:
Z = \ f. (\ x. f (\ v. x x v)) (\ x. f (\ v. x x v))
Let’s prove the Z
combinator is a fixed point combinator by applying it to some function F
and showing that it satisfies Z F = F (Z F)
Z F = (\ f. (\ x. f (\ v. x x v)) (\ x. f (\ v. x x v))) F
= (\ x. F (\ v. x x v)) (\ x. F (\ v. x x v))
= F (\ v. (\ x. F (\ v. x x v))
(\ x. F (\ v. x x v))
v
)
= F (\ v. F (\ v. (\ x. F (\ v. x x v))
(\ x. F (\ v. x x v))
v
)
v
)
= F (\ v. (F (\ v. (\ x. F (\ v. x x v))
(\ x. F (\ v. x x v))
v
)
)
v
)
Now this is a bit tricky, but remember that each entry of this is still equal to Z F
. We’re going to use the third statement:
Z F = F (\ v. (\ x. F (\ v. x x v))
(\ x. F (\ v. x x v))
v
)
To do a substitution in the last statement:
Z F = F (\ v. ( F (\ v. (\ x. F (\ v. x x v))
(\ x. F (\ v. x x v))
v
)
)
v
)
= F ((\ v. Z F) v)
= F (Z F)
So Z
is a fixed point combinator. Therefore,
G = Z (\ g x . g g)
= (\ g x . g g) (Z (\ g x . g g))
= \ x. (Z (\ g x . g g)) (Z (\ g x . g g))
= \ x. G G
Furthermore, this shows that for any fixed point combinator fix
,
G = fix (\ g x . g g) => G X = G G
2.14 Definition of Numerals
Let
F
be a lambda term, andn
be a natural number. Then let us notated repeated application ofF
toM
withF^0 M = M F^1 M = F M F^(n+1) M = F (F^n M)
The Church numerals
C_0, C_1, C_2, ...
are defined byC_n = \ f x. f^n x
For an alternative presentation, see 2 Arithmetic, Notes: Tutorial Intro to Lambda Calculus (Rojas)
2.15 Definition of Arithmetic Operations
Let’s prove that
add = \ x y p q . x p (y p q);
mult = \ x y z. x (y z)
exp = \ x y. y x
are expressions that perform addition, multiplication and exponentiation, respectively.
Addition
Okay, so a valid addition expression add
will have the following property
add C_n C_m = C_(n + m)
That is, the addition of two church numerals representing the natural numbers n
and m
will be the church numeral representing their sum n + m
By the definition of church numerals:
C_(n + m) = \ f x . (f^(n + m) x)
\ f x. f^(n + 0) M) = \ f x . (f^(n) x)
\ f x. f^(n + 1) M) = \ f x . f (f^(n) x)
By induction on the definition of the ^
notation:
\ f x. f^(n + 0) M) = \ f x . f^0 (f^n x)
\ f x. f^(n + 1) M) = \ f x . f^1 (f^n x)
\ f x. f^(n + 2) M) = \ f x . f^2 (f^n x)
...
\ f x. f^(n + m) M) = \ f x . f^m (f^n x)
Then if add
is:
add = \ m n f x . m f (n f x)
add C_m C_n = \ f x. (C_m f) (C_n f x)
= \ f x. ((\ g x. g^m x) f) ((\ f x. f^n x) f x)
= \ f x. (\ x. f^m x) (f^n x)
= \ f x. f^m (f^n x)
= \ f x. (f^(n + m) x)
= C_(n + m)
Q.E.D.
Multiplication
A valid multiplication expression will have the following property:
mult C_n C_m = C_(n * m)
If mult = \n m t. n (m t)
:
mult C_n C_m = (\n m t. n (m t)) (\ f x. f^n x) (\ f x. f^m x)
= \ t. (\ f x. f^n x) ((\ f x. f^m x) t)
= \ t. (\ f x. f^n x) (\ x. t^m x)
= \ t. (\ f x. f^n x) (\ x. t^m x)
= \ t. (\ x. (\ x. t^m x)^n x)
By eta conversion (\ x. F x) = F
and by the lemma 2.16, (x^m)^n = x^(m*n)
mult C_n C_m = \ t x. (t^m)^n x
= \ t x. (t^(m * n)) x
= C_(m * n)
Exponentiation
A valid exponentiation expression will have the following property:
exp C_n C_m = C_(n ^ m)
If exp = \ x y. y x
then
exp C_n C_m = (\ x y. y x) C_n C_m
= C_m C_n
= (\ f x. f^m x) C_n
= \ x. (C_n)^m x
= \ x. C (n ^ m) x [By lemma 2.16]
= C (n ^ m)
Exercises
2.1
(i):
M1 = v ( \ v' ((v' v'') (\ v''' (\ v'''' (v'' v''')))))
(ii):
M2 = \ x y. (\ z. z) x y (y ((\ p. x p) y))
2.2
M
and N
are expressions with some number of free instances of x
and y
(maybe none, maybe some). If we rewrite all instance of x
in M
with N
then the free instances of y
in N
are now within M[x:= N]
. If we further rewrite all instances of y
in M[x:= N]
, with L
so as to obtain M[x:= N][y:= L]
we will replace every y
in M
with L
and every y
in every N
in L
.
Therefore,
M[x:=N][y:=L] = M[x:= N[y:=L]][y:=L]
Since there are no free instances of x
in L
, we can reorder the rewrites:
M[x:= N[y:= L]][y:= L] = M[y:= L] [x:= N[y:= L]]
2.3
(i):
\ |- M1 = M2 => \ |- \x.M1 = \x.M2 => \ |- (\x.M1) N= (\x.M2) N => \ |- M1[x:= N] = M2[x:= N]
(ii):
\ |- M1 = M2 => \ |- \x.M1 = \x.M2 => \ |- (\x.M1) N1 = (\x.M2) N1 \ |- (\x.M1) N1 = (\x.M2) N1 && \ |- N1 = N2 => \ |- (\x.M1) N1= (\x.M2) N2 => \ |- M1[x:= N1] = M2[x:= N2]
2.4
See Definition of Arithmetic Operations
2.5
N = X (Y Z)
2.6
(i):
(\x y z. z y x) a a (\p q. q) = (\p q. q) a a = a
(ii):
(\y z. z y) ((\x. x x x) (\x. x x x)) (\w. I) = (\w. I) ((\x. x x x) (\x. x x x)) = I
(iii):
SKSKSK = (\x y z. x z (y z)) K S K S K = (K K (S K)) S K = K S K = S
2.7
(i):
KI = (\x y. x) (\ x . x) = \ y. \x . x = \ y x. x = K*
(ii):
SKK = \ z. K z (K z) = \ z . z = I
2.8
(i):
F = \x y. x (y x) y
(ii):
F = \m n l. n (\x. m) (\y z. y l m)
2.9
(i):
F = \x. x I
(ii):
F = \x y. x I y
2.10
(i):
K_n = Y K
(ii):
F = Y (\x y. y x) => F x = (\ x y. y x) (Y (\x y. y x)) x = x F
(iii):
F = Y (\ w x y z. w K) => F I K K = (\ w x y z. w K) (Y (\w x y z. w K)) I K K = (Y (\x y z. w K)) K = F K
2.11
I don’t really understand what the comma in square brackets means, but I’ll try anyway.
For all expressions C
(that are a function of some number of arguments), there exists an expression F
such that for all argument vectors x~
, F x~ = C F x~
.
For any given C:
F = Y (\ x. C x) = (Y (\x. C x)) = (\ x. C x) F = C F
Which gives F x~ = C F x~
via substitution.
2.12
\y . (\ x y. y)
(i): By defintion of
#
,P # Q => \ + (P = Q) |- M = N
for allM, N
in lambda, true and false are in lambda, thereforeP # Q => \ + (P = Q) |- true = false.
If
\ + (P = Q) |- true = false
, then for anyM, N
in lambdatrue = false => K = K* => K M = K* M => K M N => K* M N => M = N
By definition, if
\ + (P = Q) |- M = N
for allM, N
thenP # Q
. Therefore,P # Q <=> \ + (P = Q) |- true = false
(ii):
I = K => I I = K I => I = K* => K = K*
(iii):
F = (\p . (G p) y x)` G = (\g. g K g) G I = I K I = K I = K* G K = K K K = K F = \p. ((\g. g K g) p) y x F I = (\p. ((\g. g K g) p) y x) I = ((\g. g K g) I) y x = (I K I) y x = (K I) y x = K* y x = x F K = (\p .((\g. g K g) p) y x) K = ((\g. g K g) K) y x = (K K K) y x = K y x = y
(iv):
K = S => K K I = S K I => K = I => K = K*
2.13
<expression> := <name> | <abstraction> | <application>
<name> := v' | v'' | v''' | v''' | ...
<abstraction> := \ <name> <expression>
<application> := <expression> <expression>