bitbashing

Thu, 20 Nov 2008

Switching to Pyblosxom, and a colophon

Until recently I had been running bitbashing on blosxom, a minimalist blog system which stores each entry as a flat file on disk. My existing workflow relies heavily on tools like emacs for editing and merging and monotone for revision control, and it is nice to have a blog system that plays well with these other tools, rather than using, say, a MySQL database as the storage layer and an AJAX widget as the editor. Over time, however, blosxom has seemed less and less maintained, and I started looking for alternatives.

Today I switched to pyblosxom, which started as a clone of blosxom and still has much the same philosophy, but seems to offer a number of useful features that blosxom lacks. A description of the upgrade process, along with a site colophon, is after the jump.

Initially I installed pyblosxom (using Gentoo's emerge command) and copied pyblosxom.cgi to the existing CGI directory from which I was serving blosxom. It was quite easy to switch between blosxom and pyblosxom simply by changing which CGI requests were directed to, using Apache mod_rewrite directives in my .htaccess file. The process overall was quite seamless, and only relatively minor changes were needed. For instance, pyblosxom seems to have split the date flavour entry into date_head and date_foot; all I needed to do here was rename my date flavour to date_head.

At first I was running pyblosxom as a CGI. However, upon testing the site with ab, the Apache benchmark tool, I realized that a Python process was being started for each hit to the site, which was not exactly optimal. In the past mod_perl had taken care of this for me. Fortunately, pyblosxom works very well with WSGI, the Web Server Gateway Interface, which is a standard interface between web servers and Python applications. After installing mod_wsgi (using the version of mod_wsgi 2.3 found in Gentoo's portage), ab showed dramatic reductions in the average and worst case (99th percentile) response latency. In the 99th percentile, response time went from several seconds to less than 300 ms, while the 50th percentile also showed substantial reductions, down to under 150 ms. This is not exactly blazingly fast, but I could always turn on pyblosxom's static rendering mode if I needed to handle more than a few thousand hits a day.
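To make the CGI versus WSGI difference concrete, here is a minimal sketch of what a WSGI application looks like. It is not pyblosxom's actual entry point: the names application, call_app and hits are hypothetical, and the counter exists only to show that module-level state survives between requests when the interpreter stays resident, which is exactly what a per-hit CGI process cannot do.

```python
from wsgiref.util import setup_testing_defaults

# Module-level state persists across requests while the interpreter stays
# loaded (as under mod_wsgi). Under plain CGI it would reset on every hit.
hits = 0


def application(environ, start_response):
    """Hypothetical stand-in for a blog's WSGI entry point."""
    global hits
    hits += 1
    path = environ.get("PATH_INFO", "/")
    body = ("hit number %d for %s\n" % (hits, path)).encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    # A WSGI application returns an iterable of byte strings.
    return [body]


def call_app(path):
    """Invoke the application directly, without any web server."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(application(environ, start_response))
    return captured["status"], body
```

Calling call_app("/about") twice in the same interpreter returns hit number 1 and then hit number 2, since nothing is reloaded in between.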

I had been using blosxom in suexec mode, so the easiest thing to do with WSGI was to set up a process pool which would spawn a new process to handle the requests as a distinct user from the main web server user identity. This also seemed a good mode in which to set an upper bound on the amount of resources pyblosxom can consume. I am currently using this line in my Apache config, which creates a process pool that is shared by the pyblosxom instances running for bitbashing and the news page for botan:

WSGIDaemonProcess randombit.net processes=2 threads=4 maximum-requests=2000 user=lloydwww

By default mod_wsgi will use many threads in a single process, but in tests I found that pyblosxom under WSGI was much better able to deal with load using multiple processes as compared to threads. Under high load, a single process, even with many threads enabled, would never use more than a single CPU's worth of processing time. In contrast, I found that enabling more than one process allowed a great deal more parallelism, and latencies with many (I tested 10 to 30) concurrent requests were kept relatively stable.
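For anyone wanting to repeat this kind of comparison without ab, the measurement can be roughly sketched in Python. This is not what ab does internally, just a small stand-in under my own assumptions: the names percentile, timed_call and load_test are hypothetical, request is any zero-argument callable that performs one hit, and the percentile uses the simple nearest-rank definition.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def percentile(sorted_samples, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    # Integer ceiling of len * pct / 100, clamped to at least rank 1.
    rank = max(1, (len(sorted_samples) * pct + 99) // 100)
    return sorted_samples[rank - 1]


def timed_call(request):
    """Run one request and return its wall-clock latency in seconds."""
    start = time.perf_counter()
    request()
    return time.perf_counter() - start


def load_test(request, total, concurrency):
    """Issue `total` requests, at most `concurrency` at a time."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(lambda _: timed_call(request), range(total)))
    return {
        "p50": percentile(latencies, 50),
        "p99": percentile(latencies, 99),
    }
```

In practice request might be something like lambda: urllib.request.urlopen(url).read() for a URL of your choosing, run once with concurrency 10 and once with 30 against each mod_wsgi configuration.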

I'm not sure if this effect is something specific to my platform, and if so, the reason for it. One possible cause may be that randombit.net is actually a Virtuozzo VPS being run by Aktiom Networks in Denver. It seems possible, though a bit odd, that Virtuozzo's scheduler perhaps gives different weight to processes versus threads in a way that affects mod_wsgi's performance here.

One thing that had really bothered me about continuing to use blosxom is that many of the plugins listed in the blosxom plugin registry point to long-dead links. That did not really give me a good feeling about the future prospects of blosxom, and I was happy to see that pyblosxom seems to have a much more active plugin development scene. Unfortunately the plugin system is very poorly documented in pyblosxom, but what I have seen of plugin sources so far suggests that one can do reasonably interesting things without much effort. Plugins I'm currently using on this site with pyblosxom include readmore, pyentrynavi, and entrycache.

posted 2008/11/20 19:53 [category: bitbashing / about]
