Thursday, November 12, 2009

It's Go time

At last, the Go programming language has been publicly released so I can write about it without getting into big trouble.

Go was designed by programmers I hold in high regard. Although it is a new language, the same people already experimented with some of its ideas at least ten years ago.

Back then, I saw a demonstration of the Inferno operating system, the successor to Plan 9, which in turn was the sequel to UNIX. I remain awed that just a handful of programmers could implement a complex system so well. The venerable UNIX crew seem to make the right decision often, and make it ten years ahead of everyone else.

For example, the dis byte code ran on a register-based virtual machine and hence ran fast. In contrast, its chief competitor, Java, has a stack-based virtual machine, and it seemed to take years to realize that this was a poor solution. The problem is so severe that JIT compilation was introduced, weakening the "Write Once, Run Anywhere" mantra: the statement becomes almost meaningless if one requires some sort of compiler for every platform, as opposed to a simple byte code interpreter. Another workaround is to design a register-based virtual machine for Java from scratch.

Another case in point is Limbo, a novel programming language that accompanied Inferno. Sadly Limbo has far fewer adherents than Java, for non-engineering reasons. Like Plan 9 (which remained closed-source until 2000), Inferno was hard to obtain, whereas Java was being given away: indeed, users were once practically forced by their browsers to install Java virtual machines. The relative obscurity of Limbo is therefore expected, as with any human language possessing few speakers and few opportunities for growth.

A second chance

It’s much more exciting this time around. The powers that be have wisely made Go open source. Now that Go is playing on a level field, ideas that Limbo should have popularized finally have a chance to spread far and wide.

My favourite feature is something they left out. As with Limbo, there is no inheritance. There is no stifling type system. At least one generation of programmers has been trained to use a rigid type system, and I believe history will one day prove the inheritance mindset to be a passing fad. I have strong feelings about this topic because I was once a firm believer in inheritance, and it took years to realize my mistake.

Go instead emphasizes interfaces, and provides a simple concise syntax for them. A few neat lines perform the equivalent of several messy lines in C that deal with structs of function pointers.

Go also has strong concurrent programming features, which it shares with Limbo. Different threads (actually, "goroutines") communicate via channels (folowing the CSP model), eliminating race conditions. Channels are essentially type-safe UNIX pipes: they capture the power and delight of writing shell scripts to string together several tools.

Among lesser niceties: nested functions, anonymous functions, multiple return values, the package and import keywords, untyped numerical constants, reflection. Most of my wishes for C have been granted.

I’ll probably mostly stick with C. I’ve grown accustomed to its flaws, and I like squeezing every drop of performance out of code without dropping down to assembly. But for those tasks that require more than a shell script but less than a C project, Go might just fit the bill.

Sunday, November 8, 2009

Project Euler

Lately my spare time has been eaten by an MMORPG: Project Euler. Even though I have played few RPGs, I’m sure Project Euler is one of the most difficult and challenging.

Actually, it’s really not an RPG. Rather than reward the clicking of buttons, the only way to gain levels in Project Euler is to submit answers to puzzles. Each involves a little bit of computer science and a little bit of mathematics.

Along the way, I wrote a library to make it easier to do arbitrary precision arithmetic in C. For example, to solve problem 97, I could write:

  mpz_t z;
mpz_init(z);
mpx_eval("mpz a; a = (28433 * 2^7830457 + 1) % 10^10;", z);
gmp_printf("%Zd\n", z);

Strangely, Project Euler only lets you try at most 25 problems over n days, where n is your current level (except at level 0, when it is 1). Perhaps I should be grateful, as bumping into this limit gives me time to post code!

Monday, November 2, 2009

I'm right, you're wrong

I enjoy ranting about programming languages, whatever the medium: blog posts, emails to friends, colleagues and strangers, verbally to anyone within earshot. Mostly I rail against the evils of object-oriented languages, and why C is still the one true language.

Over the years, I’ve amassed enough material to fill several pages, which I’m now launching. Hopefully if I concentrate all these tirades in one place I’ll expend less energy overall on arguing!

Monday, August 24, 2009

Napkin Cryptography

I was a cryptography researcher in a past life. Occasionally I experience echoes from my previous incarnation: a paper review here, a technical question there, so I still feel somewhat connected.

A couple of weeks ago I bumped into Daniel J. Bernstein and Tanja Lange, whom I had last seen in 2003 at a conference in Chicago. [Thanks to Dan for inviting me, and also for taking everyone out to the downtown bars and restaurants. I had a blast!] Only then did I realize how far I had fallen from the light. In a hurried notes-on-a-napkin conversation I tried to catch up on what was once my field:

2048 is the new 1024

Firstly, I was mildly surprised when I learned governments and standards bodies are actually heeding warnings that 1024-bit RSA keys are crackable by large corporations and botnets, thus now mandate 2048-bit keys.

Ah, the memories. Just how vulnerable are 1024-bit keys? Back in the ivory tower, this was the subject of an intense high-profile debate, with Bernstein on one side, and Lenstra, Shamir, Tomlinson and Tromer on the other. The heated arguments seemed to get personal at times, providing excellent fodder for gossip amongst us grad students.

I have no idea if the parties ever reconciled, nor if the fireworks have even subsided, but I’m relieved that everybody agrees 1024 bits is insufficient nowadays.

Fast ECC

More interesting to me are the new breed of elliptic curve cryptography (ECC) implementations. Dan instructed me to "forget everything you know about elliptic curves" before launching into the world’s fastest course on the world’s fastest algorithms for elliptic curves.

Consider a unit circle, x2 + y2 = 1. We can define a group operation on the points of the unit circle via angle addition. It’s easiest to describe with polar coordinates I suppose: cis(α) composed with cis(β) gives cis(α+β). Except for some reason we want (0,1) to be the identity so make that cis(α+β-π).

In Cartesian coordinates, if the two input points are (x1, y1) and (x2, y2) then we get (x1 y2 + x2 y1, y1 y2 - x1 x2). Let us denote this point by (a, b).

An Edwards curve is a deformed circle that is equivalent to an elliptic curve in some sense. There is some deal about requiring a point of order 4, which is probably related to the quadrilateral symmetry of the Edwards curve. It is parameterized by d which I’m guessing is akin to the j-invariant, and has the equation x2 + y2 = 1 + d x2 y2.

Using the above notation, group addition produces the point (a / (1 + D), b / (1 - D)) where D = d x1 x2 y1 y2. Unlike elliptic curves, the same formula applies for identical inputs, and the also for the identity element: in paricular, there is no special formula for point doubling. From this equation one can write code to multiply points at breakneck speed.

Montgomery curves are given by B y2 = x3 + A x2 + x, and possess fast point addition with one big string attached: you can only add P and Q if you know P - Q; this point forms the base of an addition "ladder". This is fine for exponentiations and hence ECDSA, but may be unsuitable for other cryptosystems.

The details, and more, can be found at the Explicit-Formulas Database. Also, see this highly optimized implementation of the curve 25519.

DNSCurve

To Dan, speedy ECC is merely a means to an end: a brilliant and diabolical scheme to usurp DNSSEC with DNSCurve. If DNSSEC does indeed have the drawbacks he described, then I wish him well. Kill it before it spreads! The Denial-of-Service attack amplification issue is enough to justify revoking its netizenship.

Pairings

For the PBC library, the good news is that Weierstrass curves are still the best known way to get at pairings (because of the order 4 point brouhaha), though research on alternate curves for pairing computation continues at a furious pace.

The bad news is that the sample pairing parameters I’ve been distributing are mostly too weak. Additionally, Barreto-Naehrig curves, the least-optimized case in the library, are arguably now the most important pairing type.

By the way, our discussion literally involved scribbling on a serviette:



Alas, I am not a coauthor of this paper: the handwriting belongs exclusively to Bernstein and Lange.

Thursday, August 13, 2009

Tune out

I had grand plans for my Android game. Apart from actually completing it, I was going to fix bugs, and add sound and music.

But my enthusiasm evaporated as fast as it had arrived, and now that I have forgotten the password to my key, it looks even less likely I’ll release new versions.

I did get around to composing a tune for use somewhere in the game. Seeing as it almost certainly never find its way to the game, I may as well dump it here:

I lacked the patience to learn how LilyPond deals with "da capo al fine". Luckily it’s clear where it belongs; this is one of those A-A-B-A melodies.

Another problem was excessive whitespace in the generated PNG image. My older post must have used lilypond-book in its original incarnation, but that was back when I wrote raw icky HTML.

There must be some lilypond option to automatically crop the output, but I found manual labyrinthine. So instead, to produce the above image I ran something like:

$ lilypond --png foo.ly
$ convert -trim -border 8x8 -bordercolor white foo.png foo.png

Thursday, July 30, 2009

ZDD fun

I finally released the source for my logic puzzle solvers using ZDDs. I was hoping to finish at least the Nurikabe solver first, but I lost interest.

Mirrors:

I used the code to help construct the following Slither Link puzzle:

2...23
..3...
..3..2
.2.22.
....22
1..1.1

ZDDs are fascinating data structures, and using them to solve a logic puzzle is more fun that the puzzle itself. I hope to revisit them some day.

Monday, June 8, 2009

Much Ado About Nothing

Whitespace in XML files can behave unintuitively, especially when it comes to newlines. My script that converts AsciiDoc to Blogger posts entry is peppered with awful kludges meant to bring them in line, but still fails to indent source code.

After much experimentation, and as little reading of the official XML specifications as possible, I have determined the best practices for my situation:

  • Adding xml:space="preserve" to the opening content tag preserves the source code indentation.

  • It appears impossible to preserve newlines anywhere. They vanish mysteriously, not even leaving a space behind. By default XML leaves a space, so I don’t understand why; perhaps it’s an Atom thing. Thus I add "<br />" at end of each line where formatting matters, and prefix each newline with a space.

My updated script appears below, this time with indentation.

#!/bin/bash

if [[ -z "$1" ]] ; then
echo Usage: $0 ASCIIDOC_SOURCE [LABELS...]
exit 1
fi

if [[ ! -f "$1" ]] ; then
echo $1 not found.
exit 1
fi

outfile=$1.xml
# Extract = Title =, which must be on a line by itself.
title=$(grep '^ *=' -m 1 $1 | sed 's/^ *=* *//' | sed 's/ *=* *$//')

# Hmm, this draft thing used to work, but not anymore.
echo '<entry xmlns="http://www.w3.org/2005/Atom">
<app:control xmlns:app="http://www.w3.org/2007/app">
<app:draft>yes</app:draft>
</app:control>
<title type="text">'$title'</title>
<content type="xhtml" xml:space="preserve">
<div xmlns="http://www.w3.org/1999/xhtml">' > $outfile
# We need \n newlines for sed.
asciidoc -a newline=\\n -s -o - $1 >> $outfile
echo '
</div>
</content>' >> $outfile
while [[ -n $2 ]]; do
echo ' <category scheme="http://www.blogger.com/atom/ns#" term="'"$2"'" />' >> $outfile
shift
done
echo '</entry>' >> $outfile

# I can't figure out how to preserve line breaks in Atom XML,
# hence the following.
# Prefix each newline with a space.
sed -i 's/$/ /' $outfile
# Add <br /> to the end of all lines between <pre> and </pre>.
sed -i '/<pre>/,/<\/pre>/s/ *$/<br \/>/' $outfile
# Undo what we just did for the </pre> line.
sed -i 's/\(<\/pre>.*\)<br \/>/\1/' $outfile

if [[ -z $AUTH_TOKEN ]]; then
stty -echo
read -p "Blogger password: " pw
stty echo

token=$(curl --silent https://www.google.com/accounts/ClientLogin \
-d Email=benlynn@gmail.com -d Passwd="$pw" \
-d accountType=GOOGLE \
-d source=asciidoc2blogger \
-d service=blogger | grep Auth | cut -d = -f 2)
AUTH_TOKEN=$token
echo AUTH_TOKEN=$token
fi

# The URL was cut and pasted from <link rel="service.post"> from
# my blog's HTML source.
curl --silent --request POST --data "@$outfile" \
--header "Content-Type: application/atom+xml" \
--header "Authorization: GoogleLogin auth=$AUTH_TOKEN" \
"http://www.blogger.com/feeds/4222267598459829544/posts/default" \
| tidy -xml -indent -quiet