Thursday, May 14, 2009

Notable Notation

Pursuing a graduate degree in computer science at Stanford offered many perks. I benefited from at least one more than some of my colleagues, as I was assigned an office only a few doors away from Don Knuth's.

I'd occasionally catch a glimpse of the legendary computer scientist, for some reason usually donning a bicycle helmet. However, the proximity of his office alone seems to have had an effect: I switched to LaTeX for typesetting. As an undergraduate I loyally used Lout, whose author, Jeff Kingston, had an office near where I worked. In seriousness, it was probably more for practical reasons than out of respect: all my co-authors used LaTeX.

I still have a soft spot for Lout with its clean and comprehensible internals. Compare with the inscrutable black-magic LaTeX style sheets built on top of TeX, which itself is tricky to grasp. However, the TeX equation syntax certainly deserves its status as the lingua franca of mathematicians communicating via email and other plain text mediums.

My workplace is no longer near Knuth or Kingston. Perhaps not entirely coincidentally, I no longer use TeX or Lout. Wanting documentation sources to be as readable as much as possible, I devised a markup language inspired by venerable ASCII typesetting tricks of heavy email users and README authors, such as *asterisks*, _underscores_, and ==Headings==, and wrote a script to translate it to HTML. Luckily I didn't get far before discovering AsciiDoc.

Equation-heavy notes, books, HTML, PDF: AsciiDoc handles them all despite using source files that look like ordinary text documents. Easy to learn, easy to type, and easy on the eyes. Fine-tuning the final layout is difficult, a blessing in disguise as users have solid excuses for ignoring those typesetting nitpicks TeXperts are expected to correct.

But I digress: back to name-dropping Donald E. Knuth. Although he was warm and friendly, I only spoke to him a few times as he was also busy. However, I had many chances to listen to him. Sometimes he'd give a talk in the meeting area outside my door, where I always had a guaranteed seat: I could simply wheel my office chair a few meters.

Regrettably I recall almost nothing. There was a story about how noticing a pattern in table of numbers yielded a PhD thesis; he made it sound so easy. I also vaguely remember a fascinating introduction to coroutines along with an obscure but accessible application: generating certain Gray code sequences. This would describe all I have left from Knuth's talks, but happily, some of his Computer Musings were recorded and are freely available online.

The Notation episode is surprisingly interesting, especially the historical tidbits. The notation for floor and ceiling functions only became standard relatively recently, a brilliant idea that usurped many inferior schemes and rapidly took over the world. Another young fad is to use "lg" for the base 2 logarithm, though now it seems we should use "lb" instead.

The highlight was Iverson's bracket convention: one places an expression within square brackets, and the whole thing evaluates to 1 if the expression is true, and 0 otherwise. For example, [k is even] means 1 when k is even, and 0 otherwise. Other examples:
  • sign(x) = [x > 0] - [x < 0]
  • [A and B] + [A or B] = [A] + [B]
"Two notes on notation" describes how to harness the power of the square bracket.

This should all sound familiar to C programmers, because many C operators evaluate to 1 when the result is true and 0 otherwise. It's simultaneously disheartening and amusing that while mathematicians and computer scientists are discovering the utility of this convention, some programming language designers actively oppose it by introducing a boolean data type that cannot be interchanged with integers.

The lecture was crammed with other ideas that deserve to be widespread. I discuss them somewhere I can use AsciiDoc and LaTeX equation notation.

No comments: