Saturday, April 10, 2010

Self-publishing with CreateSpace

From time to time, somebody sends me a kind email saying that they only truly appreciated Git after encountering my Git guide. One such reader had already bought a few Git books, and he suggested I should therefore turn my website into a book.

I had idly thought about doing this, but why bother? Was: FREE, Now: $9.95!? However, the email made me realize that some seek information by buying books first, then look around online if they want more. Making a book out of my guide might be a good idea after all: I’m not trying to sell it to people who already know they can read it for free; rather, I’m aiming for those who might not otherwise find it until much later because they visit bookshops before search engines.

CreateSpace

Because the most renowned technical publishers already offered books on Git, I chose to self-publish on CreateSpace. Their tools are free, and they list your work on Amazon (which owns CreateSpace). I’d love to have bricks-and-mortar bookshops carry copies of the book too, but an Amazon listing should be enough for now.

In a brief search, I found controversy over CreateSpace ISBNs, but Richard Sutton’s post reassured me: firstly, for my book, the issues stemming from CreateSpace being the registered owner of the ISBN are irrelevant, and secondly, if you really want you can have an ISBN registered in your name (but you’ll have to buy it yourself).

The whole process is not quite free. After submitting your PDF file, you must order a proof copy. If you find errors, you submit a corrected PDF, and repeat. I made a stupid mistake the first time, so I went through this cycle twice and finished down about 16 bucks.

It’s not all bad though. I was surprisingly pleased to hold my book in my hand, as it felt like I had accomplished something. Also, in print form, the same old sentences become more authoritative and strangely convincing. Online, they look like stuff that some guy posted on some random website.

Preparing the book took much longer than expected. I had mentioned to a reader that I was considering making a book, and I tried to follow the advice he gave me so it would look less amateurish. I cut a chapter and an appendix. I added an index. I renamed headings so they were more descriptive. I replaced all variables (e.g. "SHA1_HASH") in the command-line examples with values (e.g. "1b6d"). I selected a 6 inch by 9 inch form factor, which meant I had to shorten some lines to get them to fit. While doing all this, I found poorly spelled words, poorly worded paragraphs and poorly organized sections. I doubt I caught them all.

To avoid further delays, I used CreateSpace’s Easy Cover Creator. Perhaps I’ll revisit this eventually, as I want a more spartan look: something like Kernighan and Ritchie’s "The C Programming Language". Or perhaps a sort of cheat sheet so the book would be useful even while shut.

I set the price to $9.95 USD, which means I get 2 bucks or so per sale. I considered a lower price, but I’ll be lucky to make my $16 back as it is! Still, it ought to be low enough that a buyer won’t be too annoyed when they find out the material is freely available on my homepage. (I would have linked to the free version from the book description, but this is forbidden.)

AsciiDoc, xsltproc, fop

I had some trouble with my tool chain that produces PDFs from text files. AsciiDoc produces a DocBook XML file out of the source text, which xsltproc turns into an XSL-FO file, which fop renders into a PDF. The design of the various formats probably has technical merit, but I found it difficult to figure out how to get what I wanted.

For example, I replaced variables with values because I could not italicize them easily with AsciiDoc. The only methods I discovered destroyed the natural beauty of the source text.

It seems the smaller the detail, the larger the effort required to tune it. Changing page sizes, font sizes and chapter heading styles was easy enough to figure out, but I still don’t know the right way to insert a blank page after the front matter so the first chapter starts on an odd page. I gave up editing some XSL file or other. Instead, I scripted a fragile search-and-replace on the XSL-FO output.

Nonetheless, I stand by my choices. There’s something appealing about source files which resemble old-school text files. Also, once the configuration nightmare is over, editing is simple: I can use any text editor, and the tool chain will automatically produce several HTML versions as well as a reasonable PDF for a book.

Shameless plug

I couldn’t possibly end this post without a link to my book: "Git Magic". It’s the most important book you’ll ever have, or my name is not Winston! Buy it now!

Wednesday, April 7, 2010

Nginx and FastCGI

One perk I miss from my first days of grad school is my office computer with a permanent IP address. I could run all sorts of servers. (Later, the environment became harsher because unlike me, many of my peers ran Windows, but like me, they did not know how to do so securely. The IT team restricted most ports as the first line of defence, though you could ask for exceptions. Hopefully they didn’t tighten control further after I graduated.)

Wanting to be cool, I experimented with PHP when it started becoming popular. Until then, I had only dabbled with CGI programs in compiled languages. PHP was intoxicating. A Common Gateway drug, so to speak. In those Web 1.0 days, dynamic webpages were so easy and fun to make with PHP that I overdosed.

Years later I finally admitted to myself that my content was static apart from a few needless gimmicks, and this was unlikely to change. Using PHP was only increasing the CPU load. It didn’t matter because I received few hits, but it offended me as a computer scientist. I sobered up and returned to vanilla HTML.

I’ve been thinking how I might write a web application today, and I realized I’ve come full circle. I’ve lost my taste for LAMP stacks at a time when they are more widespread than ever, and once again espouse compiled languages.

Nginx

Firstly, I’ve moved on from Apache, which was once my favourite web server. I used to watch Netcraft's market share graphs so I could cheer on Apache against commercial products. But one day, I noticed a newcomer on the graphs. A strange jumble of letters: "nginx". I couldn’t resist looking it up.

Nginx shows how powerful pure unadulterated C can be in the right hands. Written by Igor Sysoev, this webserver runs on numerous platforms, hardly using any memory even as it handles thousands of requests at light speed. Nginx has somehow dodged Jeff Darcy’s Four Horsemen of Poor Performance.

Nginx cannot do CGI, but it can do FastCGI, which is a plus. Instead of spawning a new process for every request, FastCGI spawns a long-lived program once, which communicates with the webserver when necessary, possibly over a network. Of course, this program can run threads of its own if desired.

An advantage of LAMP stacks was that scripts could run without requiring a new process or thread. FastCGI puts all languages on the same footing. In fact, FastCGI is more flexible: for example, you can restart FastCGI programs independently of webservers. Perhaps this is why some run PHP via FastCGI.

Compiled languages

I prefer a compiled language to a scripting language like PHP because I crave speed and scalability. Also, one feature of PHP is useless to me: I discovered I lack the discipline to mix code with HTML. At first I found it was convenient, but eventually my webpages became hard to maintain. I now insist on strict separation between languages: my CSS, JavaScript, HTML, and whatever else ideally reside in distinct files.

Also, now that JavaScript is ubiquitous, it seems best to push as much work as possible to the client side: the FastCGI should do the minimum possible and supply its results (perhaps in JSON) to JavaScript which then plays with the data using the client’s CPU. This diminishes the need for a language designed to mingle with HTML.
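To make the JSON idea concrete, here is a minimal sketch of the one piece of care it needs on the C side: quoting a string so it is safe to embed in a JSON reply. The name json_quote is hypothetical, not part of any FastCGI library, and the sketch assumes the input is already valid UTF-8, so bytes above 0x7f are passed through untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes `in` into `out` as a quoted JSON string.
 * Returns the number of bytes written (excluding the final NUL),
 * or -1 if `out` (capacity `cap`) is too small.
 * Assumption: `in` is valid UTF-8; only quotes, backslashes and
 * control characters are escaped. */
int json_quote(const char *in, char *out, size_t cap) {
  size_t n = 0;
  assert(in != NULL && out != NULL);
  if (cap < 3) return -1;  /* need room for two quotes and a NUL */
  out[n++] = '"';
  for (; *in; in++) {
    unsigned char c = (unsigned char)*in;
    char esc[7];  /* longest escape is \u00XX plus NUL */
    int len;
    if (c == '"' || c == '\\') {
      len = snprintf(esc, sizeof esc, "\\%c", c);
    } else if (c == '\n') {
      len = snprintf(esc, sizeof esc, "\\n");
    } else if (c < 0x20) {
      len = snprintf(esc, sizeof esc, "\\u%04x", c);
    } else {
      len = snprintf(esc, sizeof esc, "%c", c);
    }
    /* Always keep room for the closing quote and the NUL. */
    if (n + (size_t)len + 2 > cap) return -1;
    memcpy(out + n, esc, (size_t)len);
    n += (size_t)len;
  }
  out[n++] = '"';
  out[n] = '\0';
  return (int)n;
}
```

Inside a FastCGI request loop, the quoted buffer could then be sent with something like FCGX_FPrintF(out, "Content-type: application/json\r\n\r\n{\"query\": %s}\n", buf), leaving all presentation to the JavaScript on the client.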

Running a web application with a scripting language purportedly allows rapid prototyping, but it seems the only drawback to a compiled language is a compilation step and a FastCGI program restart. This is negligible provided your language has a fast compiler (like C and Go). Besides, I bet much of the development cycle involves presentation tweaks, that is, edits to CSS, HTML, and JavaScript: not the compiled language.

The L and M of LAMP

I’d still run my servers on Linux. I’ve had good results with it so far. As for MySQL, I cannot say, having never experimented much with databases. Its reputation seems solid enough.

How-to

On the latest Ubuntu, you’ll need to install the packages nginx, spawn-fcgi, and libfcgi-dev. Then edit the nginx configuration file /etc/nginx/sites-available/default. In the server clause, add something like:

location = /test {
    fastcgi_pass 127.0.0.1:9000;
    fastcgi_param QUERY_STRING $query_string;
}

The file /etc/nginx/fastcgi_params contains other parameters you might want to pass. Restart nginx, by running for example:

$ sudo /etc/init.d/nginx restart

Visiting http://localhost/test should result in a 502 error because no FastCGI program is running yet.

Let’s fix this. In C, I recommend using fcgiapp.h and not fcgi_stdio.h; it’s not much more trouble, and you avoid conflicts with the standard stdio library.

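#include <fcgiapp.h>

int main(void) {
  FCGX_Stream *in, *out, *err;
  FCGX_ParamArray envp;
  /* Serve requests one at a time; FCGX_Accept returns a negative
     value on failure, which ends the loop. */
  while (FCGX_Accept(&in, &out, &err, &envp) >= 0) {
    /* NULL if the web server did not pass QUERY_STRING. */
    char *q = FCGX_GetParam("QUERY_STRING", envp);
    FCGX_FPrintF(out, "Content-type: text/plain\r\n\r\n");
    if (!q) {
      FCGX_FPrintF(out,
          "no query string: check web server configuration\n");
    } else {
      /* Only print q when it is non-NULL. */
      FCGX_FPrintF(out, "Query: '%s'\n", q);
    }
  }
  return 0;
}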

Compile your code:

$ gcc a.c -lfcgi

Then spawn the binary on your machine on port 9000:

$ spawn-fcgi -a 127.0.0.1 -p 9000 -n -- a.out

Test it by visiting http://localhost/test?example.
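If you try a query containing spaces or punctuation, the browser percent-encodes it, and QUERY_STRING arrives still encoded. Below is a rough sketch of a decoder; url_decode and hex_val are hypothetical names of my own, and treating "+" as a space is an assumption that only holds for form-style queries. It decodes in place, so call it on a copy of the string rather than on the pointer FCGX_GetParam returns.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Value of a hex digit, or -1 if c is not one. */
static int hex_val(int c) {
  if (c >= '0' && c <= '9') return c - '0';
  c = tolower(c);
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  return -1;
}

/* Hypothetical helper: decodes %XX escapes (and '+' as a space, an
 * assumption suited to form submissions) in place.
 * Malformed escapes such as "%zz" are copied through unchanged.
 * Returns the decoded length. Note that "%00" decodes to a NUL byte,
 * which would cut the C string short. */
size_t url_decode(char *s) {
  char *r = s, *w = s;
  int hi, lo;
  assert(s != NULL);
  while (*r) {
    if (*r == '%' &&
        (hi = hex_val((unsigned char)r[1])) >= 0 &&
        (lo = hex_val((unsigned char)r[2])) >= 0) {
      *w++ = (char)(hi * 16 + lo);
      r += 3;
    } else if (*r == '+') {
      *w++ = ' ';
      r++;
    } else {
      *w++ = *r++;
    }
  }
  *w = '\0';
  return (size_t)(w - s);
}
```

Decoding never lengthens the string, which is why working in place on a writable copy is enough and no second buffer is needed.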

In a real application, you might want to run the binary as a daemon, and place the process ID in a temporary file for easy access:

$ spawn-fcgi -a 127.0.0.1 -p 9000 -P /tmp/pid -- a.out

I had planned to continue this post by writing about embedding HTML files in C and fetching data with JavaScript, but it’s too long as it is. Some other time maybe.

Monday, April 5, 2010

At last


I don't play much DDR anymore. Making it through Max 300 on Heavy mode was an old goal I thought I had no hope of achieving. But several days ago I had energy to burn and played a few rounds on a whim. Oddly, my game had improved despite lack of practice. The arrows felt slower than I remembered. I tried Max 300 almost as a joke, and was amazed that I finally passed it.

Now I have to work my way up to an A!