My first Linux machine ran Red Hat Linux, and for a few years I worked on Red Hat and then Fedora. Then I got sick of things breaking every time I upgraded (on one traumatic occasion, my audio stopped working after a Fedora upgrade) and switched to Ubuntu.
In my new job, though, I need Linux and also need to encrypt my laptop's hard drive. I had a go at doing this with Ubuntu, with disastrous results: I didn't really know what I was doing (I did manage to encrypt the hard drive, but couldn't figure out how to mount it during boot). Fedora, on the other hand, makes this simple. So I moved across, and have to say I'm pretty glad I did. It's very similar to Ubuntu in many ways, and I don't feel like I'm missing anything. My main annoyance was SELinux, which just wanted to stop me having any kind of fun with my computer; so I switched it off.
So, if you need to encrypt your hard drive, but currently use Ubuntu, have a look at Fedora instead.
I have an EeePC with Moblin 2.1 (release candidate) installed on it (via the packages in http://repo.moblin.org/moblin/releases/2.1/ia32/os/). By default, you don't get an mp3 decoder with this distribution. It wasn't obvious (when I googled for it) how to install the right GStreamer plugins: several blog entries suggested a variety of pretty hacky methods and work-arounds.
Then I chanced on a blog entry which mentioned that Fluendo have a free, legally-downloadable RPM of their own mp3 decoder; I remember someone at work mentioning it too. I downloaded and installed this onto my netbook with no difficulties. You have to register for an account and go through a faux shopping basket to get it, but it is fairly painless. Here's the product page URL:
Above is the first movie I've made with xtranormal, called Ex-super villain first date. Not sure how I've never come across this site before, but it is a fantastic piece of web engineering: an in-browser (Flash), animated movie creation application. Basic accounts are free, and give you a pretty good range of stuff you can do. The interface is very simple and intuitive. It seems best suited to the sort of stilted absurdist style I attempted above, which took me maybe an hour to put together. Much fun.
(Just a quick note about JQuery and rdfquery and memory leaks and htmlunit and such like for lost souls googling desperately for clues.)
I was having issues with some Webdriver tests, running under the htmlunit driver: they worked locally, but ran into Java "out of memory: heap space exhaustion" issues when running as part of a Hudson build on our Linux build server. The build also worked on another virtual machine with the Internet Explorer driver, so it seemed to be something specific to the htmlunit driver (maybe). After upgrading Java, upgrading Hudson (the continuous integration server), and tweaking the Java heap size, and still no joy, I found a few mentions of memory leaks in relation to htmlunit and JQuery (the Javascript library we're using), which I decided were worth investigating.
One comment in particular seemed plausible: that the Javascript on the pages under test was causing memory leaks in the Javascript engine in htmlunit (Rhino, I think?). It seemed possible that how we were using JQuery was causing Javascript to leak memory, but only in the context of the htmlunit driver. I trawled around the various bug trackers for htmlunit and webdriver, and got a few ideas (though couldn't pin the issue down enough to raise a bug myself).
One suggested fix (in the general context of JQuery, rather than specifically with regard to htmlunit) was to ensure that any event handlers bound to DOM elements via JQuery are explicitly removed when the page unloads. This was easy to implement: I just added a final event binding to each page which looked like this (inside a <script> tag, obviously):
// the unload event is sent to the window, so bind the clean-up handler there
$(window).unload(function() { $('*').unbind(); });
I was also using the short-hand JQuery event binding syntax in my code, i.e. things like this:
$('#save').click(function() {
// ...
});
I changed these to the long-hand form (as this issue only started appearing when I started using the short-hand version):
$('#save').bind('click', function() {
// ...
});
One last thing I did was ensure that I called the quit() method on any Webdriver driver instances after I'd finished with them; and closeAllWindows() on any htmlunit WebClient instances when finished with them.
Unfortunately, I didn't do this very scientifically, and made all these changes at once. But the end result was that the build started running again. So if you're having out of memory errors with Webdriver/htmlunit/Hudson/JQuery/rdfquery, you at least have somewhere to start from :)

make "sky 1 make "into 2 to understand load "me clean home see home see obscure repeat :sky + :into [ see ] end to see television setscrunch :into golden :sky repeat often [ right pan forward sky ] end to golden :offering output :into * :offering end to television make "me happy end to happy op "erate "the "switches end to often make "trouble :sky + :into + :sky output power (:into + :sky) :trouble end to pan output :sky end to obscure hideturtle end
This is a Logo program which is also (kind of) a piece of poetry and image generator (it creates the image shown above). I did it a while back and found it an interesting exercise. I thought you might also find it interesting. Have you done anything similar? Anyone?
I'm not going to try to explain RDF and/or RDFa here, but thought any poor suckers looking for RDFa examples might benefit from me posting what I finally worked out, with help from my colleague Rob. Namely, how to annotate an HTML ordered list (<ol>) with RDFa attributes; and how to put RDFa attributes onto form elements.
Here's the HTML page with RDFa embedded in it. What I'm representing here is a sequence of collections, and the individual collections within it:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN" "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Collections</title>
</head>
<body>
<h1>Collections</h1>
<form method="post" action="http://receptacular.org/collections">
<ol xmlns="http://www.w3.org/1999/xhtml" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:xsd="http://www.w3.org/2001/XMLSchema#" xmlns:rec="http://receptacular.org/schema#" typeof="rdf:Seq" about="http://receptacular.org/collections">
<li rel="rdf:_1" resource="http://receptacular.org/collections/1">
<span style="display:none;" rel="rdf:type" resource="http://receptacular.org/schema#Collection"></span>
<div class="collection-label" property="rdfs:label">Vague Collection</div>
<input type="checkbox" id="collections-1-hidden" property="rec:hidden" datatype="xsd:boolean" content="false"/>
<label for="collections-1-hidden">hidden</label>
<input type="checkbox" id="collections-1-defaultSearch" property="rec:defaultSearch" datatype="xsd:boolean" content="false"/>
<label for="collections-1-defaultSearch">use for searches</label>
</li>
<li rel="rdf:_2" resource="http://receptacular.org/collections/2">
<span style="display:none;" rel="rdf:type" resource="http://receptacular.org/schema#Collection"></span>
<div class="collection-label" property="rdfs:label">Archive Collection</div>
<input type="checkbox" id="collections-2-hidden" property="rec:hidden" datatype="xsd:boolean" content="false"/>
<label for="collections-2-hidden">hidden</label>
<input type="checkbox" id="collections-2-defaultSearch" property="rec:defaultSearch" datatype="xsd:boolean" content="false"/>
<label for="collections-2-defaultSearch">use for searches</label>
</li>
<li rel="rdf:_3" resource="http://receptacular.org/collections/3">
<span style="display:none;" rel="rdf:type" resource="http://receptacular.org/schema#Collection"></span>
<div class="collection-label" property="rdfs:label">Main Collection</div>
<input type="checkbox" id="collections-3-hidden" property="rec:hidden" datatype="xsd:boolean" content="true" checked="checked"/>
<label for="collections-3-hidden">hidden</label>
<input type="checkbox" id="collections-3-defaultSearch" property="rec:defaultSearch" datatype="xsd:boolean" content="true" checked="checked"/>
<label for="collections-3-defaultSearch">use for searches</label>
</li>
</ol>
<p>
<input type="button" value="Save" id="save-collections"/>
</p>
</form>
</body>
</html>
Available online here: http://receptacular.org/collections
Things of note:
To see the RDF which can be extracted from this page, you can use the W3C's RDFa Distiller. Here's the resulting RDF:
<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF
xmlns:dist="http://www.w3.org/2007/08/pyRdfa/distiller#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:rec="http://receptacular.org/schema#"
xmlns:xhv="http://www.w3.org/1999/xhtml/vocab#"
xmlns:xml="http://www.w3.org/XML/1998/namespace"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
>
<rdf:Seq rdf:about="http://receptacular.org/collections">
<rdf:_1>
<rec:Collection rdf:about="http://receptacular.org/collections/1">
<rec:hidden rdf:datatype="http://www.w3.org/2001/XMLSchema#boolean">false</rec:hidden>
<rec:defaultSearch rdf:datatype="http://www.w3.org/2001/XMLSchema#boolean">false</rec:defaultSearch>
<rdfs:label>Vague Collection</rdfs:label>
</rec:Collection>
</rdf:_1>
<rdf:_2>
<rec:Collection rdf:about="http://receptacular.org/collections/2">
<rec:hidden rdf:datatype="http://www.w3.org/2001/XMLSchema#boolean">false</rec:hidden>
<rec:defaultSearch rdf:datatype="http://www.w3.org/2001/XMLSchema#boolean">false</rec:defaultSearch>
<rdfs:label>Archive Collection</rdfs:label>
</rec:Collection>
</rdf:_2>
<rdf:_3>
<rec:Collection rdf:about="http://receptacular.org/collections/3">
<rec:hidden rdf:datatype="http://www.w3.org/2001/XMLSchema#boolean">true</rec:hidden>
<rec:defaultSearch rdf:datatype="http://www.w3.org/2001/XMLSchema#boolean">true</rec:defaultSearch>
<rdfs:label>Main Collection</rdfs:label>
</rec:Collection>
</rdf:_3>
</rdf:Seq>
</rdf:RDF>
Saving changes to an RDFa-enabled form like this is another challenge, for which we used rdfquery, an RDF library for JQuery. (I recommend you use the latest svn HEAD version of this library, as older versions have a bug where they ignore RDFa elements nested inside elements without RDFa attributes.) Maybe I'll get round to that another time.
This is a Ruby script which randomly copies mp3 files from one directory to an mp3 player. I wrote it so I could easily fill up my mp3 player from the 9000 odd mp3s I have on a different external drive.
To run it, you'll need the sys-filesystem gem (see http://rubyforge.org/projects/sysutils):
$ gem install sys-filesystem
Next, edit these variables in the script (near the top):
* source_dir to the directory containing the mp3s you want to select from
* dest_dir to the path for the directory on your mp3 player you want to copy to
Be a bit careful, as this will attempt to fill the dest_dir you specify with mp3 files from source_dir. You might end up filling the wrong disk up.
Then just run it with ruby from the command line:
$ ruby mp3s_random.rb
Note that it won't delete anything from the destination drive, and will attempt to fill all the space available. Also note that it doesn't keep trying mp3s until it finds one which will fit the last remaining space: once it tries to copy a file which won't fit, it stops. You can always run it again to see whether the next run finds a file small enough to fit.
I've only tested it on Linux, but, who knows, it might work on Windows too. (No operating-system specific commands are used in the script, as it uses Ruby for all file operations.)
The code is below, but I've attached it as well.
require 'rubygems'
require 'sys/filesystem'
require 'fileutils'

source_dir = '/media/disk/music'
dest_dir = '/media/disk-1/music'

# find every mp3 under source_dir
files = Dir[File.join(source_dir, '**', '*.mp3')]

# work out how much space is free on the destination
stat = Sys::Filesystem.stat(dest_dir)
disk_free_space_kb = (stat.blocks_free * stat.fragment_size).to_kb

files_selected = []
while disk_free_space_kb > 0 && !files.empty? do
  # choose an mp3 at random (rand(n) returns 0..n-1, so every file is a candidate)
  file_path = files.delete_at(rand(files.size))
  # work out how big file is
  file_size_kb = File.stat(file_path).size.to_kb
  # subtract from free space; stop at the first file which won't fit
  if (disk_free_space_kb - file_size_kb) > 0
    files_selected << file_path
    disk_free_space_kb = disk_free_space_kb - file_size_kb
  else
    break
  end
end

# copy the selected files across (FileUtils replaces the old ftools File.copy)
files_selected.each do |f|
  copy_to_path = File.join(dest_dir, File.basename(f))
  puts "Copying #{f} to device"
  FileUtils.cp(f, copy_to_path)
end
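As noted above, the script gives up as soon as it picks a file which won't fit. Here's a minimal sketch of a variant which skips over-sized files and carries on instead. It isn't part of the script: select_files_to_fit and the sizes hash are names I've made up for illustration, and the file sizes are invented.

```ruby
# Hypothetical helper: walk the files in random order, taking each one that
# still fits and skipping (rather than stopping at) any which is too big.
# sizes_kb maps a file path to its size in KB; free_kb is the space available.
def select_files_to_fit(sizes_kb, free_kb, rng = Random.new)
  selected = []
  sizes_kb.keys.shuffle(random: rng).each do |path|
    size = sizes_kb[path]
    next if size > free_kb # too big for what's left: try the next file
    selected << path
    free_kb -= size
  end
  selected
end

# Invented sizes, purely for illustration.
sizes = { 'a.mp3' => 4000, 'b.mp3' => 9000, 'c.mp3' => 2500, 'd.mp3' => 700 }
picked = select_files_to_fit(sizes, 8000)
puts picked.inspect
```

The selection is kept separate from the copying so it can be tried out without touching any disks; in the real script you would build the hash from File.stat and then copy whatever comes back.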
I work on a system at Talis which posts MARC records from customer library databases into a MARC to RDF transformer. The resulting RDF generated from the MARC is sent into the Talis Platform, where it's used to power Prism.
Over the last day or so I've been working on a bug which has prevented some records going correctly through this process. Along the way, I noticed another bug occurring somewhere between the post from the customer site and our MARC to RDF transformer. It looked as if line break characters in the original MARC record were being lost somewhere in the process. Consequently, when the MARC was pushed into the transformer, the record got spat out as invalid, as the length specified in the MARC leader didn't correspond to the length of the record (now that it had lost its line break characters). (By the way, working directly with byte streams is the only way to work with MARC, for precisely this reason.)
I had a sudden insight on the way home, triggered by remembering issues I'd had with curl (the command line HTTP client) working on another personal project. On that project, I'd been trying to post RDF triples in ntriple format into my application using curl. However, the application only seemed to recognise the first RDF triple in the posted file. I couldn't understand why.
Then, when I echoed the body of the HTTP request, as received by my app from curl, I realised the issue: curl was sending the body of the request WITHOUT LINE BREAKS. As line break characters act as the delimiter between triples in RDF ntriple format, my app was only seeing a single RDF ntriple. When I tried an alternative tool to send the posts (the extremely useful Poster add-on for Firefox), the ntriples were received correctly.
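To make the effect concrete, here's a small sketch (not code from my app) which counts statements the naive way, one per non-blank line. The count_statements helper and the example.org triples are made up for illustration.

```ruby
# Hypothetical helper: count N-Triples statements by treating each
# non-blank line as one statement.
def count_statements(text)
  text.each_line.map(&:strip).reject(&:empty?).size
end

# Two made-up triples (the example.org URIs are placeholders).
ntriples = <<~NT
  <http://example.org/a> <http://example.org/p> "1" .
  <http://example.org/b> <http://example.org/p> "2" .
NT

puts count_statements(ntriples)                # body sent intact
puts count_statements(ntriples.delete("\r\n")) # body with line breaks stripped
```

With the line breaks gone, the whole body collapses into a single line, which is why a line-oriented reader only sees one statement.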
Once I remembered this, I decided to do some debugging of the kind of requests curl would send if it were posting MARC records. My hypothesis was that curl was stripping line break characters from the MARC record (which is bad, as they are valid characters in MARC), and hence causing the record to be shorter than the leader said it should be.
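The hypothesis is easy to illustrate. The first five bytes of a MARC leader hold the record length as zero-padded decimal digits, so a check along these lines shows the mismatch. This is a sketch only: marc_length_matches? is a made-up helper, not anything from our transformer, and the "record" is a toy byte string rather than real MARC.

```ruby
# Hypothetical check: does the length declared in the leader (first five
# bytes, zero-padded decimal) match the record's actual length in bytes?
def marc_length_matches?(record_bytes)
  declared = record_bytes.byteslice(0, 5).to_s
  return false unless declared =~ /\A\d{5}\z/
  declared.to_i == record_bytes.bytesize
end

# Toy record: a five-digit length followed by filler containing a line break.
# It only imitates the length field; it is not a valid MARC record.
body = "filler with a line\nbreak in it\x1D".b
record = format('%05d', body.bytesize + 5).b + body

puts marc_length_matches?(record)                # as originally written
puts marc_length_matches?(record.delete("\r\n")) # after line breaks are stripped
```

Note that it works on bytes (bytesize, byteslice) rather than characters, for the reason given earlier.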
First step was to put together something to echo and/or save HTTP request bodies. Rack is ideal for this sort of thing, so I used this little Rack web server program:
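require 'rubygems'
require 'rack'
# write the raw request body to disk (binary mode, so no bytes get translated)
def save_body(body)
  File.open('last_raw_request', 'wb') {|f| f.write(body)}
  body
end
# echo the body back; Rack expects the response body to respond to each, hence the array
Rack::Handler::WEBrick.run(lambda {|e| [200, {}, [save_body(e['rack.input'].read)]]}, :Port=>7777)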
This saves the raw request body to a file called "last_raw_request".
I first posted a MARC file with line breaks in it (attached) using Poster (with Content-Type set to application/marc21) through Firefox. The MARC file came through intact and still valid.
I then posted a MARC file with line breaks in it using curl:
curl -d @marcfile.mrc -H "Content-Type:application/marc21" http://localhost:7777/
Which produced an invalid MARC file with line breaks missing.
The solution is to use the --data-binary switch when using curl to send binary data, which we're not doing when sending MARC from the customer site. Mostly this doesn't matter, but it does when the MARC record contains line break characters.
Namely:
curl --data-binary @marcfile.mrc -H "Content-Type:application/marc21" http://localhost:7777/
It's taken a while, but a feature request I logged 2 years ago has finally made it to Drupal trunk. (The basic idea was to put a timeout on Drupal HTTP requests to other systems, to prevent a whole Drupal site timing out if one of its requests to another site hung - prompted by working on AllConsuming and Last.fm modules for Drupal.) My original patch was promptly rejected, but it's been fascinating watching the discussion around the idea over the months, culminating in a well-rounded, properly-tested patch landing in CVS.
I've got a slightly unstable computer at the moment which I've been trying to diagnose. Still haven't worked out exactly what's wrong (it freezes randomly in both Windows and Linux), but I have found some useful testing tools on the way (for Ubuntu Intrepid Ibex unless otherwise stated).
Recent Ubuntu Linux distros include MemTest86+, a memory testing tool. You just select this option from the grub boot menu when your computer starts and it boots into a dedicated memory testing OS. The tests are fairly simple to get going, but take hours, literally. You need to run them overnight.
The smartmontools package includes some testing tools for hard disks which have S.M.A.R.T. capability (most modern motherboards and hard disks support this). Once you've installed the package, you can use the smartctl command line tool to run diagnostics on your hard disks.
I used this tool like this:
$ sudo smartctl -t long /dev/sda
This starts the test, which will take a fair amount of time (mine took around 30 minutes for a 40GB disk). Once it's finished, you can do:
$ sudo smartctl -H /dev/sda
to see the drive's overall health assessment (smartctl -l selftest /dev/sda shows the log for the self-test itself). Mine looked like this:
smartctl version 5.37 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
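If you want to check that verdict from a script rather than by eye, something like this sketch would do. It assumes the output contains a line in the same format as mine above; smart_health is a made-up helper, and the sample text is just pasted from that output.

```ruby
# Hypothetical helper: pull the overall-health verdict out of the text
# printed by smartctl -H. Returns nil if no verdict line is found.
def smart_health(output)
  match = output.match(/overall-health self-assessment test result:\s*(\S+)/)
  match && match[1]
end

# Sample text copied from the output shown above.
sample = <<~OUT
  === START OF READ SMART DATA SECTION ===
  SMART overall-health self-assessment test result: PASSED
OUT

puts smart_health(sample)
```

In real use you would feed it the captured output of the smartctl command instead of the sample string.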
A simple test to max out your CPU (and exercise your graphics card) is to run GLX gears:
$ glxgears -info
However, that doesn't really stress your system. For that, I used a tool called CPU Burn-in. This is ostensibly an overclockers tool, but what it does is attempt to push your CPU to maximum operating temperature so you can see whether it's stable. It's a binary download, so it's very easy to use, and has a Linux version. Unzip it, cd to the directory, and run:
./cpuburn-in 10
where 10 is the number of minutes you want to run the tests for. This one scared me a bit, as I watched the temperature of my CPU and system slowly climb. Read the caveats and warnings on the web site before running this tool.
While you're doing all this, you want to watch the system temperature etc. For this, you can use the xsensors tool. This is a simple apt-get on Ubuntu, but for some reason the default config file is in the wrong place (it's called /etc/sensors3.conf but the app is expecting /etc/sensors.conf). You can tell it where the config file is using the -c switch, e.g.
xsensors -c /etc/sensors3.conf
This displays a graphical readout of various system temperatures, fan speeds etc.
You can also watch various aspects of system usage by adding the hardware monitor applet to your panel (right-click on the panel and select Hardware Monitor). This lets you watch how much your CPU, memory, disks etc. are being utilised.