
Gizmo on Ubuntu Dapper

I installed Gizmo today on my Z60t, for making cheap calls from my computer to landlines. Rates are low (1.5 cents per minute from the US to the UK). As far as I can tell (it seems too good to be true), I can also make calls to other Gizmo users for free: so if I can get Nicola to sign up, I can call her for free at home or on her mobile from anywhere I have an internet connection.

Anyway, installation went smoothly, and everything seemed to be going OK. I downloaded and installed these packages:

  1. bonjour
  2. libsipphoneapi (the ALSA package)
  3. gizmo-project

The only problem was that my microphone kept cutting out. In the end, I tracked this down to the Gizmo application trying to reset the Capture levels in my volume control. To fix it, I set my Gizmo audio preferences like this, with Echo cancellation and Gain control both turned off (Echo cancellation seems to overload my slow home connection, which caused some of the cutting out, and I think the Gain control was creating other problems):

And my volume control preferences like this:

The other issue I had was that my firewall seemed to be blocking the client application. It worked when I turned my firewall off, so I'm pretty sure this was the problem. This article explains how to set up your firewall. In my case, I allow all outgoing UDP and TCP traffic, so I just had to allow incoming UDP traffic on ports 5004, 5005, and 64064. If you block outgoing traffic, you need to allow outgoing TCP port 7070 and all outgoing UDP ports above 1023.

Another tip: you can test your settings from the Gizmo client by looking up "echo" in the phone book. This dials through to an echo client, which enables you to hear yourself talking.

Once all this was done, the whole thing worked really nicely. The client is really smooth and integrates well with the rest of the Gnome desktop. Though I think Nicola wasn't too impressed by my choice of Electric 80's for my hold music; nor was she impressed by the array of sound effects (cheers, boos, tiger, thunder) I played during our test conversation.

Variable pricing for open source

Matt Asay makes some good points about a company called Trium, which gives its customers the option of variable pricing: they can pay anywhere from 50% to over 100% of the quoted price. He suggests that open source could follow a similar model: get customers to pay for the value the software delivers to them, rather than paying for licenses. I think this is the right kind of approach, though I'm not sure how you'd administer it, or how you'd present it to a customer.

This reminds me of my last job, where we were using MySQL for some big applications. We hadn't paid a penny for it, and were doing our own support. A colleague and I suggested we buy a license anyway (this was in the days before MySQL Network, around version 3.23), as we were getting so much value from the product. It would be a gesture of good will. Our manager agreed and we bought a license, which we just stuck up on a wall. If you provide something great, people will pay for it. Yes, people are generally quite tight with cash; but they will pay for something if they feel they are getting value for money (which is why the Pound Shop in Northfield, where I live, is always packed on Saturdays).

Ruby Tuesday: libxml, closures

I've been doing more work on my S3 code this week, fixing a few bugs, adding features. Not ready for a new release just yet, as the functionality is a bit brittle and incomplete, but getting there.

While on this hackathon, I decided to use libxml properly for the first time (following Coda Hale's recommendation). I'm using it to parse responses from S3. It works a damn sight faster than REXML, even to the point of making a visible difference when running the test suite. Anyway, I thought it would be useful to give a pointer on how to parse a string with it, as this is not obvious from the documentation, and I had to work it out from the libxml-ruby test suite. Here it is:

require 'xml/libxml'

def get_xml_doc(xml_str)
  parser = XML::Parser.new
  parser.string = xml_str
  # next line returns an XML::Document instance
  parser.parse
end
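
To see it in action, here's a made-up fragment of an S3 response (the XML is just for illustration):

doc = get_xml_doc('<ListBucketResult><Name>my-bucket</Name></ListBucketResult>')
puts doc.find('//ListBucketResult/Name').to_a.first.content   # prints "my-bucket"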

I also discovered that on Linux, the libxml-ruby gem doesn't appear to work: to get anything happening, I had to install using extconf.

For the first time, I also discovered a real use for closures, and understood why symbols are important. I had a piece of code like this:

# metadata
@name = doc.find('//ListBucketResult/Name').to_a.first.content
@delimiter = doc.find('//ListBucketResult/Delimiter').to_a.first.content
@prefix = doc.find('//ListBucketResult/Prefix').to_a.first.content
@marker = doc.find('//ListBucketResult/Marker').to_a.first.content
@max_keys = doc.find('//ListBucketResult/MaxKeys').to_a.first.content
@is_truncated = doc.find('//ListBucketResult/IsTruncated').to_a.first.content

Now, this is very repetitive and wasteful. The other issues are that if an element does not exist (so doc.find(...).to_a.first returns nil), you get a horrible error; and the paths could be refactored so that '//ListBucketResult' is added to each automatically. So potentially the code needs to be refactored into a function. But, by thinking of a closure as a way of refactoring repeated similar function calls, I realised I could collapse this to:

# prop: a property to set
# path: path to find the XML element in the response
prop_setter = lambda do |prop, path|
  node = doc.find("//ListBucketResult/#{path}").to_a.first
  self.send("#{prop}=", node.content) if node
end

# metadata
prop_setter.call(:name, 'Name')
prop_setter.call(:delimiter, 'Delimiter')
prop_setter.call(:prefix, 'Prefix')
prop_setter.call(:marker, 'Marker')
prop_setter.call(:max_keys, 'MaxKeys')
prop_setter.call(:is_truncated, 'IsTruncated')

Much neater. I've got my function (generated by the call to lambda), but it's local to this block of code, and doesn't need to be declared as a method in the class definition. Also note that it's the name of the variable (:name, :prefix etc.) that gets passed to the prop_setter function, not its value: this means that within the function, I can set the instance variable by referencing its name. If I'd done:

prop_setter.call(@name, 'Name')

This would have passed the value of the name instance variable to the function, not a pointer to the variable itself. It's quite an obscure distinction, but I was really pleased to happen across a real use case for symbols, as it helped me understand them. (Although, thinking about it, because I'm using send, I could have just used a string for the variable name, I suppose...)
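
For what it's worth, send is equally happy with a symbol or a string; here's a trivial made-up example:

class Bucket
  attr_accessor :name
end

b = Bucket.new
b.send(:name=, 'my-bucket')    # symbol version
b.send('name=', 'my-bucket')   # a string works just as well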

S3 "golden buckets"

A neat solution to the problem of how to back a service with S3 while getting someone else to pay for the storage. The basic idea is for a supplier to sponsor a premium ("golden") bucket on S3, for which the consumer pays Amazon; the consumer pays slightly more than for a standard bucket, Amazon takes a cut, and the supplier takes a cut. I think it could work.

Thoughts on rsync and S3

rsync is a great utility. I bloody love it, and use it to do all my backups: over the network via SSH onto remote filesystems, and locally onto USB drives. I also use it to restore my home directory when I switch machine. Great.

However, the network part of my backup solution is expensive (in money, that is). I use Strongspace at the moment, which works really well, but it is too expensive: $8 a month for 4GB of storage, which means I have to keep a close eye on my files to make sure nothing massive goes over and maxes out my account. There are alternatives, of course, and the GDrive might solve all of my problems; but for now I'm stuck with not enough network storage.

S3 gives me the right price, but there is no mature client solution which does the job for me (JungleDisk looked promising, but it seems a bit buggy on my Linux machine - it sometimes just hangs - and Nautilus doesn't properly mount WebDAV shares onto the filesystem, so I can't rsync to it). So I've been working on an S3 library for Ruby to let me sync a local filesystem to S3 automatically, only transferring the files which have changed.

I've been reading up on rsync a bit to work out what it does, and it's pretty clever: here's a technical report which explains the inner workings. It does a block-by-block comparison between the source file and the target file, using a so-called "rolling checksum" on each block to decide whether a) the source contains a block which the target does not, or b) the block in the source file already exists somewhere in the target file. This way, only changed blocks are copied from the source file to the target file (plus some checksums and block indexes). This is what makes rsync so fast.
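
Out of curiosity, here's a toy Ruby version of the weak rolling checksum described in the report (an Adler-32-style sum). This is only a sketch to show why sliding the window along by one byte is cheap; it's not the real rsync implementation:

MOD = 1 << 16   # keep each of the two sums to 16 bits

# weak checksum of a whole block (an array of bytes)
def weak_checksum(bytes)
  a = bytes.inject(0, :+) % MOD
  b = bytes.each_with_index.inject(0) { |sum, (x, i)| sum + (bytes.length - i) * x } % MOD
  [a, b, a + (b << 16)]
end

# slide the window one byte along: drop old_byte, append new_byte
def roll(a, b, old_byte, new_byte, block_size)
  a = (a - old_byte + new_byte) % MOD
  b = (b - block_size * old_byte + a) % MOD
  [a, b, a + (b << 16)]
end

data = 'the quick brown fox jumps over the lazy dog'.bytes
block = 16
a, b, _sum = weak_checksum(data[0, block])
roll(a, b, data[0], data[block], block) == weak_checksum(data[1, block])   # => true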

I'm not sure whether you could work things out the same way with S3. Each resource on S3 (accessible via a key into a bucket) has an MD5 checksum associated with it, so you could get a checksum for individual blocks or file fragments; but I'm not sure I'd want to split a file over multiple S3 keys just to be able to do block-level copying. A bucket can contain as many keys as you like (even though you are limited to 100 buckets), so this might be possible; but to retrieve a file you would need to recompose it from the fragments on S3, which is a bit crap.

The obvious approach would be to associate one key with each object you put onto S3. This has the advantage of making each file addressable by URL, rather than requiring a reconstruction of the file from S3 fragments. Then, when you sync your filesystem to S3, you compare the MD5 checksum of the local file to the MD5 checksum of the remote file. This could be pretty painful, though, and take forever: one call to S3 to get the target checksum, a checksum on the local file, then transfer of the whole file up to S3 if it has changed. It could perhaps be streamlined: instead of requesting the object using GET, you could use a HEAD request, which just gets the metadata; and perhaps the client could send an If-None-Match header with the request, passing the MD5 checksum as its value. If the MD5 checksum in the request matches the checksum on S3, S3 will return a 304 response code (not modified), so we know the object hasn't changed and we don't need to parse the MD5 checksum out of the response. This could save a few cycles.
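
As a rough sketch of that idea in Ruby (using Net::HTTP directly; the bucket URL style is simplified, and a real request to a private bucket would also need S3's authentication headers):

require 'net/http'
require 'digest/md5'

# Ask S3 whether the stored object differs from the local file, using a
# conditional HEAD request rather than fetching the object itself.
def changed_on_s3?(local_path, bucket, key)
  # for simple (non-multipart) uploads, the S3 ETag is the MD5 of the body, wrapped in quotes
  etag = '"' + Digest::MD5.file(local_path).hexdigest + '"'
  uri = URI("https://#{bucket}.s3.amazonaws.com/#{key}")
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: true) do |http|
    http.head(uri.path, 'If-None-Match' => etag)
  end
  response.code != '304'   # 304 Not Modified: checksums match, nothing to transfer
end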

An alternative might be to put a piece of "file modified" metadata onto S3 (not the same as the date on the S3 resource, but a copy of the file modification time of the local file when it was transferred). Then just compare the local file modification time to the S3 metadata when the algorithm is deciding whether a file needs to be transferred. The file stat to get the modification time is likely to be far faster than an MD5 hash.
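
A sketch of that idea (the metadata header name x-amz-meta-local-mtime is my own invention here; S3 simply stores whatever x-amz-meta-* headers you send when you PUT the object):

# when uploading, include the local mtime as user metadata, e.g.
#   'x-amz-meta-local-mtime' => File.mtime(path).to_i.to_s
# later, a HEAD response is enough to decide whether to transfer again:
def needs_transfer?(local_path, head_response)
  remote_mtime = head_response['x-amz-meta-local-mtime']
  remote_mtime.nil? || File.mtime(local_path).to_i != remote_mtime.to_i
end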

Yet another approach would be to just keep track locally of file paths and modification times (e.g. in a database) from the last time they were sent to S3. I think this is what JungleDisk does. Anything that has been added/removed/changed will be transferred without needing to reference S3 at all, so no expensive network operations. However, this will only work if S3 is only being sync'd from one location, and won't work if you are trying to sync from multiple locations. Is this enough, or should local files always be compared to S3 to determine whether they should be transferred? Maybe a local database of file modification times is enough?

One more idea: perhaps you could combine the database of local file modification times with a cache of MD5 checksums for those local files. You could then:

  1. Find any new files and transfer them to S3, while their checksum is generated and stored in the local database in the background
  2. Find any files which have changed, generate new checksums, cache them, and then compare those generated checksums to S3 checksums: the file gets transferred if its checksum differs from the one on S3
  3. Determine which local files have disappeared and optionally remove them from S3; then delete their cached MD5 checksum from the database
  4. Any files which haven't changed can have their checksum compared to the S3 resource

This has the advantage of some local caching, but isn't dependent on it; so you could use this approach to sync multiple machines to a single S3 bucket (unlike the database-only method). And the local cache could be optional, so you could use this approach from any machine, even if it was unable to do the caching (though with the ubiquity of file-based databases like SQLite it's unlikely you'll be working on a system with no database).
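
Pulling those steps together, the decision logic might look something like the following sketch (all the names are hypothetical: cache stands in for the local database, mapping each path to its last-seen [mtime, md5] pair, and remote_md5s for the checksums fetched from S3):

require 'digest/md5'

def plan_sync(local_files, cache, remote_md5s)
  to_upload, to_delete = [], []

  local_files.each do |path|
    mtime = File.mtime(path).to_i
    cached_mtime, cached_md5 = cache[path]

    if cached_mtime.nil?                      # 1. new file: transfer and cache its checksum
      cache[path] = [mtime, Digest::MD5.file(path).hexdigest]
      to_upload << path
    elsif mtime != cached_mtime               # 2. changed file: re-checksum, cache, compare to S3
      md5 = Digest::MD5.file(path).hexdigest
      cache[path] = [mtime, md5]
      to_upload << path if md5 != remote_md5s[path]
    elsif cached_md5 != remote_md5s[path]     # 4. unchanged locally, but the S3 copy differs
      to_upload << path
    end
  end

  (cache.keys - local_files).each do |path|   # 3. gone locally: optionally remove from S3
    to_delete << path
    cache.delete(path)
  end

  [to_upload, to_delete]
end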

You could use this to do two-way synchronisation too, potentially. So if a file hasn't changed locally, you could get the client to compare to the S3 version, and fetch that to replace the local file if they are different. Though I think I will be concentrating on doing it in one direction first.

Ruby Tuesday: Rails West Midlands

I run a couple of Rails training courses at OpenAdvantage in the West Midlands, and thought it would be good if "alumni" had a way to stay in touch. While there are several good Rails mailing lists, I thought it might be nice to have one which is:

  1. West Midlands, UK-focused (much of the UK Rails stuff is London-focused)
  2. Intended for new Rails programmers (e.g. people who've been on OpenAdvantage courses)

Jono set up a similar group for PHP in the West Midlands which has taken off very successfully, and I wanted to follow his approach: low key, informal, a place to discuss Rails and Rails-related topics. I didn't want to put up another Rails howto website or blog, rather provide somewhere where Rails enthusiasts in the West Midlands could meet each other.

So, I bought a domain and set up a mailing list there. We've had a few posts so far, but it would be good to get some more people involved. It's not restricted to the West Midlands, of course, but the focus will be on that locality. Feel free to join if you're interested.


I don't know, I work my fingers to the bone writing FlickrLilli, and FlickrStorm gets the TechCrunching. Maybe if I'd got round to doing the "favourites" area (where you can bookmark images you like) I'd be toast of the month. Ho hum.

And once again...

Updated AxleGrease again, to Rails 1.1.6.

AxleGrease 0.6.1

Had to release a new version of AxleGrease to keep up with the ever-so-severe security warning released by the Rails team yesterday. Upgraded to Rails 1.1.5.

Ruby Tuesday: announcing AxleGrease 0.6 (was ROROX)

I got fed up with the name "ROROX" for my Linux Rails package (Ruby on Rails on XAMPP). So I have now changed its name to AxleGrease. The project is still hosted on RubyForge under the old name, though.

To accompany this "exciting" name change, I am pleased to announce the new version, which brings the stack bang up to date. You can get the latest version from RubyForge. Full details are in the CHANGELOG, but here are the highlights:

  1. Added a whole load of new goodies: gruff, rspec, rmagick, libxml, mongrel, fakeweb, hpricot, gem_plugin, daemons, mongrel_cluster
  2. Upgraded everything in the old version (Rails, Builder, Rubygems) to latest versions
  3. Updated the included scripts to automate configuration of existing or new Rails applications, so that they work with XAMPP's Apache server; this means using mod_scgi for SCGI, and mod_proxy for Mongrel; you can mongrelise your applications with a simple:
./ mongrel /path/to/app app_name <port>

or de-mongrelise with:

./ /path/to/app app_name

As ever, it is designed for my environment: XAMPP 1.5.3a running on Ubuntu Dapper. It may work on other Linux distributions, but I've made no effort to test it. I might do eventually. Using XAMPP means I can also integrate the start/stop scripts into the XAMPP start/stop sequence, and can do:

/opt/lampp/lampp stopmyapp
/opt/lampp/lampp startmyapp
/opt/lampp/lampp restartmyapp

to control my Rails applications.

My approach to configuring Rails applications also goes against the grain of other packaged systems: rather than creating new virtual host configurations, I mount individual applications on directories within Apache's document root. This means there's no need to edit /etc/hosts, and setup is simpler. However, this is an area where I'd like some feedback, as I'm not a mod_rewrite or mod_proxy expert. Currently, the Mongrel/Apache configuration looks like this for a typical application available at http://localhost/emaildir/:

ProxyPreserveHost Off
ProxyRequests On
# so redirects work properly
ProxyPassReverse /emaildir/

RewriteEngine On
# redirect requests for dispatcher to the application root
RewriteRule ^/emaildir/dispatch(.*)$ /emaildir/ [R]
# requests for the root get redirected to index.html if it exists
RewriteCond /opt/lampp/htdocs/emaildir/index.html -f
RewriteRule ^/emaildir/$ /emaildir/index.html [R]
# any requests which can't be served from static files go to the proxy
RewriteCond /opt/lampp/htdocs%{REQUEST_URI} !-f
RewriteRule ^/emaildir/(.*)$$0 [P,QSA,L]

For this to work, you also need a line at the end of config/environment.rb like this:

ActionController::AbstractRequest.relative_url_root = '/emaildir'

(My scripts automate all this, by the way; and the remove script undoes any changes made. This is likely to work better with later versions of Rails, which are being patched to work nicely with Mongrel.)

It has taken me some time to get this straight. About 4 hours, to be precise. So hopefully it should work OK. Fingers crossed.
