Notes (LCBB 02/07): Conversion

2 Conversion

2.1 Definition of Lambda Terms

A lambda terms is either

A variable from the infinite set of variables
```
V = {v, v', v'', ...}
```
An application (M N) where M and N are both lambda terms
An abstraction (\x.M), where x is a variable and M is a lambda term

Equivalent definition with de Bruijn indices:

A lambda term is a natural number {0, 1, 2, 3, ...} or any pair of lambda terms.

A variable is a natural number.
An abstraction is any pair (0 M), where M is a term.
All other terms are applications

2.2 Examples of Lambda Terms

Equivalent formulations:

v' = 2
v'v = 2 1
\ v ( v' v) = 0 (2 1)
((\ v (v' v)) v'')  = (0 (2 1)) 3
(((\ v (\ v' (v' v))) v'') v''') = (0 ((0 (1 2)) 3)) 4

Notice that we can remove the heads (decapitate) all abstractions by using the variable number to indicate how may layers of abstraction the variable is bound in. In the last line 0 (0 (1 2)) is equivalent to \ x. \y. (y x).

De Bruijn indices are fun, but probably out of band for these notes. I plan on doing more with them later though.

2.3 Convention of Notation

Lower case letters are variables. Upper case letters are terms
M == N denotes that M and N are alpha equivalent. Two expressions are alpha equivalent if their Partial de Bruijn forms are identical. An expression may be converted into its Partial de Bruijn form by replacing every bound variable with the relative index of its binding abstraction layer, every \ followed by a variable with 0 and leaving all free variables constant.
```
(\ x y) z == (\ x y) z => (0 y) z
(\ x x) z == (\ y y) z => (0 1) z
(\ x x) z => (0 1) z \= z
(\ x x) z => (0 1) z \= (\ x y) z => (0 y) z
```
Hmm, this seems awkward, but I find the definition in the book to be a little hand-wavy. Alpha equivalence can get really complicated when we start to think about name shadowing etc. I really want to just say that expressions are alpha equivalent if their de Bruijn index forms are equivalent, but I can’t because even though free variables are free, they should be consistent in equivalent expressions.
Applications are left-associative:
```
F M1 M2 == (F M) M2
```
In an abstraction . separates the head variable from the expression body
```
\ x1. (\ x2.  M) ==  \ x1 (\ x2 M)
```
Abstractions are right-associative, and
```
\ x1. \ x2. M == \ x1. (\ x2.  M) ==  \ x1 (\ x2 M)
```

2.4 Definition of Free Variables

The set of free variables of an expression M, notated freeVariables M are all indices in an expressions de Bruijn form which point to a relative abstraction layer outside of M. Equivalently, they are the variables that remain after converting M into it’s Partial de Bruijn form.
M is a closed term or combinator if it has no free variables.

2.7 Definition of Lambda Calculus

The principle rewrite rule is beta reduction (defined above)
There are also logical rules:

Equality:
- All terms are equal to themselves
- Equality is associative
- Equality is transitive
Compatibility rules:
- Applications of equal terms to other equal terms are equal
- Abstractions of the same variable over equal terms are equal (eta conversion)
If a statement S is provable by these rules, we write \ |- S

2.8 Remark on Alpha Conversion

I’m glad that the authors agree with me here that de Bruijn is the way to go. I personally think its easier to learn de Bruijn indices first. Alpha conversion is a lot more complicated than it seems the first time you learn it. But de Bruijn hides no complexity, at the expense of being very tedious to reduce by hand. But why are we reducing terms by hand? We live in the age of the computer!

2.10 SKI Combinators

Let

I  = \ x. x
K  = \ x y. x
K* = \ x y. y
S  = \ x y z. x z (y z)

I is the identity function, K returns the first of two arguments, K* the second of two arguments.

S is trickier. It takes three arguments, and applies the application of the first to the third to the application of the second to the third. Think of it as amalgamating all three values.

Suppose we had the term \ x. X F and we wanted to apply F to X. Then we could do:

S (\ x. K* x) (\ x. K x) (\ x. x X F) =
(\ x. K* x) (\ x. x X F) ( (\ x. K) (\ x. x X F)) =
(K* X F) (K X F) =
= F X

2.12 Fixed-Point Combinator

Suppose f is a function. If there is an input x to f such that f(x) = x, then x is said to be a fixed-point of f.

And let’s also suppose we have a function fix that returns a fixed point x of a function, so that fix(f) = x.

Now we can do something neat! Let’s look at the two above equations together:

f(x) = x
fix(f) = x

Because a fixed point x is the result of applying f to x and applying fix to f:

fix(f) = f(x) = x

And we can expand this further by substituting fix(f) for x in f(x):

fix(f) = f(x) = f(fix(f)) = f(f(x))

Since we can repeat the substitution as many times as we want, this implies:

f(x) = f(f(x)) = f(f(f(x))) = ...

Which makes total sense, since every application of f in the above is just transforming x to x. So our fix function is just recursively applying f.

But can we build a lambda expression for fix? We can!

Y = \ f. (\ x. f (x x)) (\ x. f (x x))

Y F = (\ x. F (x x)) (\ x. F (x x)) =  F ((\ x. F (x x)) (\ x. F (x x)))
  = F Y F

For a more intuitive explanation on how we might have invented the Y combinator ourselves take a look at my notes on Rojas’ Tutorial Introduction to the Lambda Calculus

2.13 Example

Simply:

Y F = F Y F => Y S = S Y S

G = Y S => G X = (Y S) X = S Y S X  = S G X

Which makes sense, because by eta conversion (\ g x. S g x) = S

If G X = G G then G = \ x. G G. This seems connected to ( g x . g g) in some way, but I’m not sure how to reason from one to the other…

Well, if using an expression that generates Y doesn’t count as using Y then:

G = (\ f. (\ x . x x) (\ x. f x x)) (\ g x. g g)
  = (\f . (\ x . f x x) (\ x. f x x)) (\ g x. g g)
  = Y (\ g x. g g)
  = (\ g x. g g) (Y (\ g x. g g))
  = \ x. (Y (\ g x. g g)) (Y (\ g x. g g))
  = \ x. G G

But what if we can’t even use an expression that generates Y?

Then we can just any another fixed point combinator. Let’s try the Z combinator:

Z = \ f. (\ x. f (\ v. x x v)) (\ x. f (\ v. x x v))

Let’s prove the Z combinator is a fixed point combinator by applying it to some function F and showing that it satisfies Z F = F (Z F)

Z F = (\ f. (\ x. f (\ v. x x v)) (\ x. f (\ v. x x v))) F
    = (\ x. F (\ v. x x v)) (\ x. F (\ v. x x v))
    = F (\ v. (\ x. F (\ v. x x v))
              (\ x. F (\ v. x x v))
              v
        )
    = F (\ v. F (\ v. (\ x. F (\ v. x x v))
                      (\ x. F (\ v. x x v))
                      v
                )
              v
        )
    = F (\ v. (F (\ v. (\ x. F (\ v. x x v))
                      (\ x. F (\ v. x x v))
                      v
                 )
              )
              v
        )

Now this is a bit tricky, but remember that each entry of this is still equal to Z F. We’re going to use the third statement:

Z F = F (\ v. (\ x. F (\ v. x x v))
              (\ x. F (\ v. x x v))
              v
        )

To do a substitution in the last statement:

Z F = F (\ v. ( F (\ v. (\ x. F (\ v. x x v))
                        (\ x. F (\ v. x x v))
                        v
                  )
              )
              v
        )
    = F ((\ v. Z F) v)
    = F (Z F)

So Z is a fixed point combinator. Therefore,

G = Z (\ g x . g g)
  = (\ g x . g g) (Z (\ g x . g g))
  = \ x. (Z (\ g x . g g)) (Z (\ g x . g g))
  = \ x. G G

Furthermore, this shows that for any fixed point combinator fix,

G = fix (\ g x . g g) => G X = G G

2.14 Definition of Numerals

Let F be a lambda term, and n be a natural number. Then let us notated repeated application of F to M with
```
F^0 M = M
F^1 M = F M
F^(n+1) M = F (F^n M)
```
The Church numerals C_0, C_1, C_2, ... are defined by
```
C_n = \ f x. f^n x
```

For an alternative presentation, see 2 Arithmetic, Notes: Tutorial Intro to Lambda Calculus (Rojas)

2.15 Definition of Arithmetic Operations

Let’s prove that

add  = \ x y p q . x p (y p q);
mult = \ x y z. x (y z)
exp  = \ x y. y x

are expressions that perform addition, multiplication and exponentiation, respectively.

Addition

Okay, so a valid addition expression add will have the following property

add C_n C_m = C_(n + m)

That is, the addition of two church numerals representing the natural numbers n and m will be the church numeral representing their sum n + m

By the definition of church numerals:

C_(n + m) = \ f x . (f^(n + m) x)

\ f x. f^(n + 0) M) = \ f x . (f^(n) x)
\ f x. f^(n + 1) M) = \ f x . f (f^(n) x)

By induction on the definition of the ^ notation:

\ f x. f^(n + 0) M) = \ f x . f^0 (f^n x)
\ f x. f^(n + 1) M) = \ f x . f^1 (f^n x)
\ f x. f^(n + 2) M) = \ f x . f^2 (f^n x)
...
\ f x. f^(n + m) M) = \ f x . f^m (f^n x)

Then if add is:

add  = \ m n f x . m f (n f x)

add C_m C_n = \ f x. (C_m f) (C_n f x)
            = \ f x. ((\ g x. g^m x) f) ((\ f x. f^n x) f x)
            = \ f x. (\ x. f^m x) (f^n x)
            = \ f x. f^m (f^n x)
            = \ f x.  (f^(n + m) x)
            = C_(n + m)

Q.E.D.

Multiplication

A valid multiplication expression will have the following property:

mult C_n C_m = C_(n * m)

If mult = \n m t. n (m t):

mult C_n C_m = (\n m t. n (m t)) (\ f x. f^n x) (\ f x. f^m x)
             = \ t. (\ f x. f^n x) ((\ f x. f^m x) t)
             = \ t. (\ f x. f^n x) (\ x. t^m x)
             = \ t. (\ f x. f^n x) (\ x. t^m x)
             = \ t. (\ x. (\ x. t^m x)^n x)

By eta conversion (\ x. F x) = F and by the lemma 2.16, (x^m)^n = x^(m*n)

mult C_n C_m = \ t x. (t^m)^n x
             = \ t x. (t^(m * n)) x
             = C_(m * n)

Exponentiation

A valid exponentiation expression will have the following property:

exp C_n C_m = C_(n ^ m)

If exp = \ x y. y x then

exp C_n C_m = (\ x y. y x) C_n C_m
            = C_m C_n
            = (\ f x. f^m x) C_n
            = \ x. (C_n)^m x
            = \ x. C (n ^ m) x   [By lemma 2.16]
            = C (n ^ m)

Exercises

2.1

(i):

M1 = v ( \ v' ((v' v'') (\ v''' (\ v'''' (v'' v''')))))

(ii):

M2 = \ x y. (\ z. z) x y (y ((\ p. x p) y))

2.2

M and N are expressions with some number of free instances of x and y (maybe none, maybe some). If we rewrite all instance of x in M with N then the free instances of y in N are now within M[x:= N]. If we further rewrite all instances of y in M[x:= N], with L so as to obtain M[x:= N][y:= L] we will replace every y in M with L and every y in every N in L.

Therefore,

M[x:=N][y:=L] = M[x:= N[y:=L]][y:=L]

Since there are no free instances of x in L, we can reorder the rewrites:

M[x:= N[y:= L]][y:= L] = M[y:= L] [x:= N[y:= L]]

2.3

(i):

\ |- M1 = M2 =>
\ |- \x.M1 = \x.M2 =>
\ |- (\x.M1) N= (\x.M2) N =>
\ |- M1[x:= N] = M2[x:= N]

(ii):

\ |- M1 = M2 =>
\ |- \x.M1 = \x.M2 =>
\ |- (\x.M1) N1 = (\x.M2) N1

\ |- (\x.M1) N1 = (\x.M2) N1 && \ |- N1 = N2 =>
\ |- (\x.M1) N1= (\x.M2) N2 =>
\ |- M1[x:= N1] = M2[x:= N2]

2.4

See Definition of Arithmetic Operations

2.5

N = X (Y Z)

2.6

(i):

(\x y z. z y x) a a (\p q. q) = (\p q. q) a a = a

(ii):

(\y z. z y) ((\x. x x x) (\x. x x x)) (\w. I) =
(\w. I) ((\x. x x x) (\x. x x x)) =
I

(iii):

SKSKSK = (\x y z. x z (y z)) K S K S K = (K K (S K)) S K = K S K = S

2.7

(i):

KI = (\x y. x) (\ x . x) = \ y. \x . x = \ y x. x = K*

(ii):
```
SKK = \ z. K z (K z) = \ z . z = I
```

2.8

(i):
```
F = \x y. x (y x) y
```
(ii):
```
F = \m n l. n (\x. m) (\y z. y l m)
```

2.9

(i):
```
F = \x. x I
```
(ii):
```
F = \x y. x I y
```

2.10

(i):
```
K_n = Y K
```

(ii):

F = Y (\x y. y x) => F x = (\ x y. y x) (Y (\x y. y x)) x = x F

(iii):

F = Y (\ w x y z.  w K)  =>
F I K K = (\ w x y z. w K) (Y (\w x y z. w K)) I K K
        = (Y (\x y z. w K)) K
        = F K

2.11

I don’t really understand what the comma in square brackets means, but I’ll try anyway.

For all expressions C (that are a function of some number of arguments), there exists an expression F such that for all argument vectors x~, F x~ = C F x~.

For any given C:

F = Y (\ x. C x) = (Y (\x. C x)) = (\ x. C x) F = C F

Which gives F x~ = C F x~ via substitution.

2.12

\y . (\ x y. y)

(i): By defintion of #, P # Q => \ + (P = Q) |- M = N for all M, N in lambda, true and false are in lambda, therefore P # Q => \ + (P = Q) |- true = false.

If \ + (P = Q) |- true = false, then for any M, N in lambda
```
true = false => K = K* => K M = K* M => K M N => K* M N => M = N
```
By definition, if \ + (P = Q) |- M = N for all M, N then P # Q. Therefore,
```
P # Q <=> \ + (P = Q) |- true = false
```
(ii): I = K => I I = K I => I = K* => K = K*

(iii):

 F = (\p . (G p) y x)`
 G = (\g. g K g)
 G I = I K I = K I = K*
 G K = K K K = K

 F = \p. ((\g. g K g) p) y x

 F I = (\p. ((\g. g K g) p) y x) I
     = ((\g. g K g) I) y x
     = (I K I) y x
     = (K I) y x
     =  K* y x
     = x

 F K = (\p .((\g. g K g) p) y x) K
     = ((\g. g K g) K) y x
     = (K K K) y x
     = K y x
     = y

(iv): K = S => K K I = S K I => K = I => K = K*

2.13

<expression> := <name> | <abstraction> | <application>
<name> := v' | v'' | v''' | v''' | ...
<abstraction> := \ <name> <expression>
<application> := <expression> <expression>