P2 - Why does the Y combinator work?
______________________________________________________________________

1. How can we model recursive functions?

Suppose we want to model a recursive function f. In Haskell, we might write:
	f x = ... f e ...
In the lambda calculus, however, we cannot write:
	f = \x-> ... f ...
because the RHS is not closed.  I.e., the name f is not bound in the RHS. Since the only way to bind names in lambda is by abstraction and application, we must abstract from the name f:
	r = \f -> \x -> ... f ...
Now the RHS is closed, but we do not have the function that we need.  We  have instead an abstraction r over our function f.  To get f, we must  supply f as a parameter, i.e.:
	r f => f
This means that f is a fixed point of r.  To get f back, then, we need a HO function Y that takes the fixed point of an arbitrary function r.  That is, we want:
	Y r => f such that r f = f
______________________________________________________________________

2. What does Y look like?
Clearly,
	Y = \r -> F
where F represents the fixed point we want.  I.e., r F <=> F Since F takes our fixed point, it must recursively apply r to itself. We have already seen a recursive function:
	Omega = (\x->x x)(\x->x x)
Omega just loops, without doing any productive work.  We need F to behave  like Omega, but doing something productive in each step, so:
	F = (\x-> ... (x x)) (\x-> ... (x x))
In this expression, (x x) will reduce again to F.  Since we want F to  reduce to r F (i.e., producing the fixed point we want), we write:
	F = (\x->r (x x)) (\x->r (x x)) => r F
and therefore:
	Y = \r -> (\x->r (x x)) (\x->r (x x))
Now we have a closed expression for Y which takes the fixed by of r by  recursive applications of itself.
______________________________________________________________________

3. Does this really work?
f = Y r
	= (\x->r (x x)) (\x->r (x x))
	= (r (\x->r (x x)) (\x->r (x x)))
	= r f
which is what we want.
______________________________________________________________________
