To Mock a Mockingbird by Raymond M. Smullyan should be required reading for any fan of the programming language Haskell. We learn combinatory logic through a series of delightful puzzles, almost without realizing it.
We’re asked to imagine a forest populated by talking birds. On encountering one of these birds, we may call out the name of any bird. In response, the bird will say the name of some bird in the forest. (The reply could be the same bird we named, or the bird’s own name, or any other bird.)
An enchanted forest populated by birds is disarmingly endearing. We’re almost unaware we’re actually dealing with a set of functions that take a function as input and return a function. The evocative backdrop also pays homage to Haskell Curry, who was an avid birdwatcher.
One puzzle challenges us to find an egocentric bird given that a lark lives in the forest. Or, using mundane terminology, given a combinator L such that (Lx)y = x(yy) for all x and y, construct a combinator E such that EE = E.
The author states that his solution is “a bit tricky” and consists of 12 correctly parenthesized Ls, and that he doesn’t know whether a shorter solution exists.
To maximize the likelihood of solving this puzzle, the reader should take advantage of facts learned from previous puzzles, and build up to the solution in stages. But that’s only if you’re playing fair and using pencil and paper! Instead, I saw an opportunity to bust out one of my favourite snippets of code.
Seeing the forest for the trees
Let us first recast the problem in terms of trees. Instead of Ls and parentheses, we work with syntax trees. In other words, we work with full binary trees where each leaf node corresponds to an L, and to evaluate an internal node, we recursively evaluate its child nodes, then apply the left child to the right child. (In Smullyan’s terminology, we call out the name of the bird represented by the right child to the bird represented by the left child.)
In this setting, the puzzle is to find a full binary tree such that repeatedly transforming parts of the tree according to the rule (Lx)y = x(yy) produces a tree where both of the root node’s children are identical to the original tree.
Hence to solve with brute force, we need only generate all full binary trees containing up to 12 leaf nodes, and for each one see if we can transform the tree into two copies of itself.
Here’s where my treasured routine comes in. The following roughly describes how to call a function on every full binary tree with exactly n leaf nodes:

- Allocate a node x.
- If n is 1, then mark x as a leaf, call the function, then return.
- Otherwise mark x as an internal node, and for every 0 < k < n:
  - For every full binary tree y with k leaf nodes:
    - Set the left child of x to y.
    - For every full binary tree z with n − k leaf nodes:
      - Set the right child of x to z.
      - Call the function.

We generate the left and right subtrees by calling this algorithm recursively. More precisely, in Go:
type node struct {
  kind        int // 0 for a leaf, 1 for an internal node
  left, right *node
}

// forall_tree calls fun on every full binary tree with exactly n leaf nodes.
func forall_tree(n int, fun func(*node)) {
  x := &node{}
  if n == 1 {
    fun(x)
    return
  }
  x.kind = 1
  for k := 1; k < n; k++ {
    forall_tree(k, func(y *node) {
      x.left = y
      forall_tree(n-k, func(z *node) {
        x.right = z
        fun(x)
      })
    })
  }
}
I was proud when I found this gem a few years ago while working on a Project Euler problem, though I’d be shocked if it were original. [Actually, my first version preallocated an array of 2n − 1 nodes and used indices instead of pointers to save a bit of time and space, but this is less elegant.]
For example, we can print the first 10 Catalan numbers:
func main() {
  for n := 1; n <= 10; n++ {
    count := 0
    forall_tree(n, func(*node) { count++ })
    fmt.Println(count)
  }
}
Or print all full binary trees with exactly 6 leaf nodes, as parenthesized expressions:
func tree_print(p *node) {
  if p.kind == 0 {
    fmt.Print("L")
    return
  }
  fmt.Print("(")
  tree_print(p.left)
  tree_print(p.right)
  fmt.Print(")")
}

func main() {
  forall_tree(6, func(p *node) {
    tree_print(p)
    fmt.Println()
  })
}
With a little more effort, we can write a program that solves the puzzle. However, some care is needed: if we replace subtrees of the form (Lx)y with x(yy) and vice versa without rhyme or reason, we’ll have no idea when we’ll finish, and we’ll only stumble across a solution by chance.
Instead, we observe that (Lx)y is either strictly smaller than x(yy), or has the form (Lx)L: if x and y contain |x| and |y| leaves, then x(yy) contains |x| + 2|y| leaves while (Lx)y contains only |x| + |y| + 1, so the latter is smaller except when y is the single leaf L. Let us say that we are reducing the tree when we replace x(yy) by (Lx)y, and expanding when we perform the reverse. Thus rather than starting from a tree t and repeatedly applying the rule to obtain the tree tt, we do the reverse: we start from tt, and consider reductions only. The above observation implies that every sequence of reductions must terminate eventually.
But what if we need to temporarily expand tt before reducing it in order to reach t? Let’s optimistically hope that Smullyan’s 12L solution was sufficiently expanded; that is, only reductions are needed to go from tt to t, where t is his solution.
Multiple subtrees may be reducible, and choosing the wrong one can block the future reductions needed to reach the goal. We therefore try every path: for each possible reduction, we apply it and recurse. Since several routes can lead to the same tree, this wastefully repeats many computations, which we tackle in the obvious manner: by remembering the trees we’ve seen so far.
I wrote solutions in GNU C and Go. The Go solution is a bit too slow for comfort. The C code is slightly clumsier mainly because I had to name the anonymous functions (though one can define a macro to work around this). C also lacks a map data structure, but this was no problem thanks to my recently released BLT library.
Results (Spoiler Alert)
Optimism paid off. On my laptop, my C program took well under a minute to find 4 solutions:
(((L(LL))(L(LL)))((L(LL))(L(LL))))
(((L(LL))((LL)L))((L(LL))((LL)L)))
(((L(LL))((LL)L))(((LL)L)((LL)L)))
((((LL)L)((LL)L))(((LL)L)((LL)L)))
The Go version took about 10 minutes.
These all contain 12 Ls, so in a sense, Smullyan’s solution is minimal. Since no other strings are printed, these four 12L solutions are minimal when only reductions are permitted.
If we allow expansions (that is, (Lx)y → x(yy)), then firstly, we have at least 2^4 = 16 solutions of length 12, since in this case (L(LL)) and ((LL)L) are interchangeable, and secondly, we can reduce the above strings to find shorter solutions. For example, the solution:
(((L(LL))(L(LL)))((L(LL))(L(LL))))
reduces to:
(L((L(LL))(L(LL))))(L(LL))
which further reduces to:
((LL)(L(LL)))(L(LL))
which only has 8 Ls. I doubt Smullyan missed this. My guess is he meant that if you solve the problem the clever way, then you arrive at an expression with 12 Ls; reductions should be ignored because they only obscure the ingenuity of the solution.
Is there, say, a 7L expression that expands to some expression (that is necessarily longer than 24 Ls) which can be reduced to half of itself? I think not, but I have no proof.
Exercise: Four Fours
I wanted to do something with the Go version of the forall_tree() routine, so I tweaked it to solve the four fours puzzle. I just ploughed through all possible trees and evaluated each one; there are only 320 to do. For larger variants of the puzzle, I’d use dynamic programming; that is, memoize subtrees and their values. Division by zero is annoying, but I managed to keep the tree evaluation function short and sweet by using Go’s panic/recover mechanism.
Recursion: the best thing since recursion
The forall_tree() function is but one example of the eloquence of anonymous functions and recursion. For similar reasons, nested functions are also indispensable. We attain economy of expression by letting the stack automatically take care of the heavy lifting.
Curiously, early structured languages including ALGOL, Simula, and Pascal supported nested functions, but C shied away from this beautiful feature.
Its absence is sorely missed, as can be seen in C-style callbacks. An ugly void * argument is inevitably passed and cast. For instance, take the udata parameter in the SDL audio API.
Its absence is also egregiously gratuitous. GCC supports nested functions by using this one weird trick [compiler writers hate him!]: it builds trampolines on the stack. One might complain that this weird trick conflicts with executable space protection, but executable space protection is worse than pointless thanks to return-oriented programming.
Code Complete