# Intuitively Deriving the $Y$ Combinator

$$ \newcommand{\one}[1]{ \color{red}{ #1 } } \newcommand{\two}[1]{ \color{blue}{ #1 } } \newcommand{\three}[1]{ \color{green}{ #1 } } \newcommand{\four}[1]{ \color{magenta}{ #1 } } $$The $\lambda$ calculus is designed to be a model of computation, so it may be surprising that it has

Quadrupling ($Q$) was chosen over doubling ($D$) and tripling ($T$) because the names $D$ and $T$ are used for other functions later in the post.

## Conditionals and Booleans in $\lambda$ calculus

Like

One could argue that an algorithm such as:
Algorithm" `A(x)`

:
1. Return `A(x)`

, which contains no base case, is still a recursive algorithm

. On the contrary, Donald Knuth states in *The Art of Computer Programming* (Third Edition) that [an] algorithm must always terminate after a finite number of steps

.

Any conditional computation if $C$ then $A$ else $B$

is choosing between two results, $A$ and $B$, based on the value of $C$. If $C$ is `true`

, then $A$ is returned, and if $C$ is `false`

, $B$ is returned. In $\lambda$ calculus, `true`

and `false`

are simply defined to return the first and second argument, respectively:
$$ \underbrace{ \lambda xy.x }_\text{true} \hspace{20pt} \text{ and } \hspace{20pt} \underbrace{ \lambda xy.y }_\text{false} $$

Usage of $\lambda$ booleans differ slightly from usage of booleans in most systems. Instead of needing to be wrapped in an if-statement or some other construct, they are themselves functions which do the choosing operation. In order to express:
$$\text{if $x \geq 0$ then $1$ else $-1$}$$
(

Note that $-1$ is *not* the function $-$ applied to the value $1$ but rather the single unit negative one

Also: $\geq$, $0$, $1$, $-1$ and all other symbols which are not parameters are not real

$\lambda$ calculus constructs, but merely shorthand. Assignment does not exist in $\lambda$ calculus. These values only stand for some unspecified functions, with the promise that these functions work together as expected. For instance, that $\geq00 = \text{true}$, $+24=6$, etc. This is elaborated on in another note.

Because assignment doesn't really exist and shorthand is not real

$\lambda$ calculus, I tried to avoid using it in this post, generally opting for $\underbrace{\text{underbace}}$ and $\overbrace{\text{overbrace}}$ labeling notation instead. However, how $\geq$, $1$, $-1$ and the rest of the numbers are defined is out of the scope of this post, so I had little choice but to use shorthand for these items.

The behavior of $sign$ is not universally agreed upon. Wikipedia defines $sign(0)=0$ (and also calls the function $sgn$), and $sign(0)$ is sometimes left undefined. I personally have found it useful (particularly in programming) to have $sign(0)=1$. I use that definition here because it's a simpler implementation, requiring only one conditional branch rather than two.

Thus $sign(2)=1$ and $sign(-3)=-1$: $$ \begin{array}{ccc} (\overbrace{\lambda x. (\geq 0 x)(1)(-1)}^{sign})2 & \hspace{20pt} & (\overbrace{\lambda x.(\geq 0 x)(1)(-1)}^{sign})(-3) \\ (\geq 0 2)(1)(-1) && (\geq 0 (-3))(1)(-1) \\ (\underbrace{\lambda ab.a}_\text{true})(1)(-1) && (\underbrace{\lambda ab.b}_\text{false})(1)(-1) \\ 1 && -1 \end{array} $$

## Back to Recursion

If we try to write $Q$ in $\lambda$ calculus, we will quickly realize that it isn't so simple. It certainly looks *something* `RECUR`

?

The $Z$ function is just $isZero$, defined as: $$ Z(x) := \begin{cases} \text{true} & x = 0 \\ \text{false} & x \neq 0 \end{cases} $$

Note that $Q^R$ is not some operation $\cdot^R$ applied to $Q$ but rather just another variable name.

Recursion is self-reference, so our implementation of $Q$ needs to have access to itself in its body. However, lambdas *to* itself as an arguemnet? Something like:
$$ \underbrace{\lambda \color{red}{R}x.Zx0(+4(\color{red}{R}(-x1)))}_{Q^R} $$
where $R$ is just $F$uck, there's an issue

. $Q^R$ expects its first argument to be the recursive function $R$, but we're only passing it the $x$ value. We can fix this by passing $R$ itself within the body of $Q$:
$$ \underbrace{\lambda Rx.Zx0(+4(\color{red}{RR}(-x1)))}_{Q^{RR}} $$
and indeed this works:
$$ \begin{align*}
& \overbrace{\one{(\lambda Rx.Zx0(+4(RR(-x1))))}}^{Q^{RR}} \ \two{Q^{RR}} \ \three 2 \\
&= \one{Z\three{2}0(+4(\two{Q^{RR}Q^{RR}}(-\three{2}1)))} \\
&= \one{+4(\two{Q^{RR}Q^{RR}}(-\three{2}1))} \\
&= \one{+4(\two{Q^{RR}Q^{RR}}\three{1})} \\
&= \one{+4(\two{(\lambda Rx.Zx0(+4(RR(-x1)))){Q^{RR}}}\three{1})} \\
&= \one{+4(\two{Z\three{1}0(+4({Q^{RR}}(-\three{1}1))))})} \\
&= \one{+4(\two{+4({Q^{RR}}(-\three{1}1)))})} \\
&= \one{+4(\two{+4(\four{{Q^{RR}}}{Q^{RR}}\three{0}))})} \\
&= \one{+4(\two{+4(\four{(\lambda Rx.Zx0(+4RR(-x1))))}{Q^{RR}}\three{0}))})} \\
&= \one{+4(\two{+4\four{(Z\three{0}0(+4\two{{Q^{RR}}}(-\three{0}1))))})})} \\
&= \one{+4(\two{+4\four{((\overbrace{\lambda ab.a}^\text{true})0(+4\two{{Q^{RR}}}(-\three{0}1))))})})} \\
&= \one{+4(\two{+4\four{0})})} \\
&= \one{8}
\end{align*} $$

In general, a function in double-$R$

form should, when passed to itself, produce the desired recursive function. We can wrap up this passing-to-self

or self-invokation

into

$I$ is a fun function because $II =_\beta II$: $$ \begin{align*} & II \\ &= (\lambda f.ff) (\lambda g.gg) \\ &= ff[f / (\lambda g.gg)] \\ &= (\lambda g.gg) (\lambda g.gg) \\ &= II \end{align*} $$

$$ \overbrace{\lambda f.ff}^I $$ so that we can just write $IQ^{RR}$ or, for any double-$R$ $f$, $If$. If $f$ is a function in double-$R$ form, $If$ is the desired recursive function.

Now I don't know about you, but writing functions in double-$R$ form seems like a bit of a pain to me. I don't want to have to remember to write $RR$ instead of $R$ each time I write a recursive function. I want $R$ to be the real

recursion function. Let's make something to do this $R$-doubling for us. Call it $D$ for double

:
$$ \overbrace{ \lambda fR.f(RR) }^D $$

Now we can use the simpler formulation $Q^R$ instead of $Q^{RR}$ and have $D$ do the work for us: $$ \begin{align*} & (\overbrace{\one{\lambda fR.f(RR)}}^D) \ (\overbrace{\two{\lambda Rx.Zx0(+4(RR(-x1)))}}^{Q^R}) \\ &= \one{\lambda R.(\two{\lambda Rx.Zx0(+4(RR(-x1)))})(RR)} \\ &= \one{\lambda R.(\two{\lambda x.Zx0(+4(\one{RR}(-x1)))})} \\ &= \underbrace{\lambda Rx.Zx0(+4(RR(-x1))))}_{Q^{RR}} \\ \end{align*} $$

In general, $D$ should take a recursive function from single-$R$ form to double-$R$ form. And we know that a function in double-$R$ form may be transformed into the desired recursive function via $I$. Thus, if we have a function $f$ in single-$R$ form, $I(Df)$ is the desired recursive function. And $I \circ D$ takes a function from single-$R$ form to recursive form. But what is $I \circ D$? It's the same as $\lambda f. I(Df)$, which is

$$ \begin{align*} & \one{\lambda f.\two{I}(\three{D}f)} \\ &= \one{\lambda f.\two{(\lambda f.ff)}(\three{(\lambda fR.f(RR))}f)} \\ &= \one{\lambda f.\two{(\lambda f.ff)}\three{(\lambda R.\one{f}(RR))}} \\ &= \one{\lambda f.\three{(\lambda R.\one{f}(RR))}\three{(\lambda R.\one{f}(RR))}} \\ &= \lambda f. (\lambda x.{f}(xx))(\lambda x.{f}(xx)) \\ \end{align*} $$ which is the $Y$ combinator!Though we can write e.g. $$ S := \lambda abc.b(abc) $$ this kind of definition of $S$ is only shorthand, and must be expanded to (i.e., replaced with) its definition before evaluation. $\lambda$ calculus does not have any rules on its own to evaluate these kinds of expressions.

If we write recursive shorthand such as:
$$ F := \lambda x. Fx $$
attempting to expand this for evaluation will cause infinite recursion, so it can never be evaluated as real $\lambda$ calculus. We could, of course, create a *new* system which allows for these things, but this new system would not be $\lambda$ calculus.