tech

Removing a password from a PDF on Linux

I just bought a PDF, legally, from a publisher's website. However, in their wisdom, they decided it would be a good idea to password-protect all legally-purchased PDFs. This means that each time you open the PDF using Acrobat Reader, you have to remember and type in the password to read it. (Evince, the default PDF viewer on Ubuntu, lets you save the password permanently, but I tend to use Acrobat as it copes better with some PDFs.)

So, if you know the password for a PDF and want to remove it, you can use the command line tools pdftops and ps2pdf to free your PDF from its chains.

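  • Install pdftops. On Ubuntu, you can do:
    sudo apt-get install xpdf-utils
    (On newer releases, pdftops ships in the poppler-utils package instead.)
  • Install ps2pdf. It is part of Ghostscript, which I think is already included in a default Ubuntu install.
  • Convert the PDF to a PostScript file, using the password:
    pdftops -upw <password> <file>.pdf
    This writes <file>.ps.
  • Convert the resulting PostScript (which is now sans password) back to a PDF:
    ps2pdf <file>.ps
    This writes <file>.pdf in the current directory, so keep a copy of the original if that is where it lives.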

The only things you lose are any PDF-specific features which don't translate to PostScript, e.g. hyperlinks.

Remember, this only works if you know the password for the PDF: it doesn't break the PDF password for you.
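
The two conversion steps can be wrapped in a short script. This is only a sketch in Ruby: the method names unlock_commands and unlock_pdf and the -unlocked.pdf output name are made up for illustration, and it assumes pdftops and ps2pdf are on your PATH. It passes explicit output filenames to both tools so the original PDF is left untouched.

```ruby
# Build the two commands for a given PDF and its user password.
# Returns an array of argument arrays, each suitable for system(*args).
# The "-unlocked.pdf" suffix is a hypothetical naming choice.
def unlock_commands(pdf_path, password)
  base = pdf_path.sub(/\.pdf\z/i, '')
  ps_path = "#{base}.ps"
  out_path = "#{base}-unlocked.pdf"
  [
    ['pdftops', '-upw', password, pdf_path, ps_path], # PDF -> PostScript
    ['ps2pdf', ps_path, out_path]                     # PostScript -> PDF
  ]
end

# Run each command in turn, stopping at the first one that fails.
# Returns true only if both conversions succeeded.
def unlock_pdf(pdf_path, password)
  unlock_commands(pdf_path, password).all? { |args| system(*args) }
end
```

Calling unlock_pdf('book.pdf', 'secret') (both placeholders) should leave book-unlocked.pdf next to the original. Passing the arguments to system as separate strings bypasses the shell, so odd characters in the password need no quoting.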

Dell laptops with Ubuntu have arrived

Finally, you can now buy a laptop with Ubuntu pre-installed in the UK. This is fantastic news. I'll definitely be getting my next laptop there. Just imagine: a laptop with no hardware/Linux compatibility issues. How cool would that be?

I tried putting one together and managed to get 2048MB RAM, a dual core 1.73GHz Celeron processor, plus the basics (basic DVD/CDRW drive, no accidental damage cover, basic screen etc.) for £512. Not bad at all.

LUGRadio Live this weekend

LUGRadio Live 2007 is running tomorrow and Sunday. I can't make tomorrow (which is a shame, as most of the good talks seem to be on Saturday), but will be popping in for Sunday. See you there if you're going. If you're not going, I'd recommend it if you're at all interested in free and open source software: I've been to the previous two events, and thoroughly enjoyed both. This year (unlike last year) I'm not doing any talks or BOF sessions, so will be able to just have a good nose around and enjoy the talks. At £5 to get in, it's a snip.

Open source showcase report

The Open Source Showcase at OpenAdvantage on 20th June 2007 was a great event. I enjoyed putting it together (along with the other staff at OpenAdvantage, of course), and was pleased at the turnout (50 people) and the number of speakers we managed to get (13). Despite my chicken timer (to keep speakers to the 10 minute time limit) almost breaking, everything went swimmingly. Though it was a bit like a wedding, in that I was so busy trying to organise everybody that I didn't get to sit and enjoy it; I just hope the attendees and speakers found it fruitful.

I've written up a report on the event on the OpenAdvantage website.

Progress on Drupal Last.fm module

I had another issue with my Last.fm module the other day, which is why it's currently turned off. I think it happens if the Last.fm feeds are unavailable: the module's HTTP requests time out, which in turn causes the whole of Drupal to time out as it waits for the response, which means my whole site falls over.

I've been using the drupal_http_request() function to run my HTTP requests, but unfortunately you can't adjust its timeout setting. So I dug around in that code, and have submitted a feature request and patch which enables you to customise the timeout when using this function. I then rewrote my module with shorter timeouts when making requests to Last.fm, which seems to do the trick.
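
The patch and module themselves are PHP and aren't shown here. As a rough illustration of the same idea in Ruby, using Net::HTTP from the standard library: put a short timeout on the request and treat any failure as "no data", so the rest of the page can still render. The method name fetch_with_timeout and the 3-second default are assumptions for the example, not anything taken from the module.

```ruby
require 'net/http'
require 'uri'

# Fetch a URL, giving up after timeout_seconds while connecting or reading.
# Returns the response body, or nil if the request failed or timed out,
# so the caller can carry on without the remote data.
def fetch_with_timeout(url, timeout_seconds = 3)
  uri = URI.parse(url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = (uri.scheme == 'https')
  http.open_timeout = timeout_seconds # limit on opening the connection
  http.read_timeout = timeout_seconds # limit on each wait for data
  http.max_retries = 0                # don't silently retry a timed-out GET
  response = http.get(uri.request_uri)
  response.is_a?(Net::HTTPSuccess) ? response.body : nil
rescue Timeout::Error, SocketError, SystemCallError
  # Net::OpenTimeout and Net::ReadTimeout are both Timeout::Error subclasses.
  nil
end
```

A caller would then check for nil and show an empty listing rather than letting the page hang while the remote service is down.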

Hopefully, if this patch makes it into Drupal, I will be able to release the new version of my module, complete with timeouts, so it won't cripple my site or anyone else's. It also adds a Last.fm recent tracks listing to user profiles (if they've set up their username) and has a block (only for one user - I just put it in for myself, really). It still needs a bit of work, and only handles recent tracks, but it's coming along fine.

Don't you Windows users get fed up?

I've switched my Windows machine on today, as my sister sent me a CD which won't open on my Linux machine, but inexplicably will on Windows. I'm currently copying a load of music tracks from it to my laptop. However, every 5 minutes or so, Windows pops up a dialogue box which says it has updated itself and needs to restart, with two buttons: Restart Now and Restart Later. It also has a countdown progress bar, which gives me 5 minutes to decide which I want to do: presumably if I left it, it would restart once the progress bar reached zero. Despite me pressing Restart Later, the window has popped up 3 times already. When I click Restart Later, I mean "later, when I've decided I want to restart", not "I can't decide, ask me again in 5 minutes"!

Do you people out there using Windows find this enjoyable? How on earth do you stop yourself putting your fist through the monitor with the constant nagging?

By the way sis, if you're reading this, thanks for the MP3s.

Packt Open Source CMS Awards

I'm a judge on the panel of the Packt Open Source CMS Awards, in the category of Best Overall CMS.

Nominations open on 16th July 2007, so remember to twitch your voting fingers. There's a $5000 prize to the winner in the category I'm judging, plus prizes of $3000 for second place and $2000 for third. Other categories include Most Promising, Best PHP, Best Other, and Best Social Networking; but I'm not a judge for any of those.

(By the way, despite using Drupal myself, I will be scrupulously objective as always.)

Pro Drupal Development - Review

Pro Drupal Development (by John K. VanDyk and Matt Westgate, published by Apress; get it here) is a great little book. I know a bit about Drupal, and have written a couple of modules, but always felt like I was skirting the edge of some dark lake I dared not step into; when I did get in, I was quickly overwhelmed by the currents, got wet, and struggled out as soon as I had what I needed. (OK, a bit melodramatic, but you get the picture.) My knowledge was mostly gleaned from the handbooks on the Drupal site, which vary widely in quality, some being excellent and complete, others patchy and inaccurate, or for obsolete Drupal versions. I've also dug around in Drupal code a lot, but a clear understanding of the architecture continued to elude me.

This book, however, shines a clear strong light into Drupal's innards. I feel like it's written for someone like me: pretty technical, fairly able to make sensible inferences if given decent examples, and with some experience of Drupal as a user and dabbler. The chapters are pretty terse, but pack in some excellent code examples and fragments to help with common tasks. For example, I've been working a bit on my Last.fm module this week, and this book helped me to:

  • Work out how to use the caching system.
  • Modify the page which displays a user's account details to show data from my module.
  • Figure out how to extend an existing Drupal "object" (a user) with fields from my module's table.
  • Understand how to write HTML generation code which works nicely with the themes system.
  • Understand Drupal coding conventions a bit more and how to document my module.
  • Write a proper module installer/uninstaller.

And probably other stuff I've forgotten. Things I had been working out from the handbooks, by making inferences from other people's code, reading forum posts etc., were covered briefly, pleasantly and clearly. They've probably saved me a good few weeks of scrabbling around. The book covers most Drupal concepts with enough depth for you to get a good overview of the whole architecture, as well as giving you practical snippets (table of contents here).

Highly recommended, and a snip at $22.50 for the ebook. No Drupal developer should be without it! Especially useful if you're not a complete Drupal nut but want to be able to write modules properly.

Ruby HTTP clients revisited

A while ago I posted about Ruby HTTP clients, and how I'd been messing around with writing my own. As is so often the case with open source, I waited around long enough, and now the good solutions are floating to the surface. When writing some simple HTTP client stuff recently (to do spidering of a Rails application), I found that the following combination worked really well and meant I could dispense with my hoary old scripts:

  • The RFuzz HTTP client library, to fetch pages.
  • The Hpricot HTML/XML parser library for parsing the pages.

Installing them was as easy as:

gem install rfuzz
gem install hpricot

You're likely to need a lot of build tools installed, though, as both gems build native extensions.

Once they were in place, I could do stuff like this (to parse all the URLs out of an HTML page):

require 'rubygems'
require 'hpricot'
require 'rfuzz/client'

client = RFuzz::HttpClient.new('localhost', 4000)

# to fetch http://localhost:4000/people and get the response body
body = client.get('/people').http_body

# to parse the links out of the response body using XPath
doc = Hpricot(body)
links = doc.search('//a')

# to get the URLs out of the links
urls = links.map { |l| l.attributes['href'] }

etc. Pretty good. I think my quest is over.
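
For spidering, the href values usually need turning into absolute URLs and filtering down to the application being crawled. Here is a minimal sketch of that step using only Ruby's standard uri library. The method name same_host_urls is made up for the example, and it assumes hrefs is the kind of array the code above produces (strings, with nil for links that have no href).

```ruby
require 'uri'

# Resolve raw href values against base_url and keep only those pointing at
# the same host and port. Skips blanks, fragment-only links and anything
# that fails to parse; strips fragments and removes duplicates.
def same_host_urls(base_url, hrefs)
  base = URI.parse(base_url)
  urls = hrefs.compact.map do |href|
    href = href.strip
    next if href.empty? || href.start_with?('#')
    begin
      absolute = base.merge(href) # resolves relative paths against the base
    rescue URI::Error
      next                        # unparseable href: ignore it
    end
    next unless absolute.host == base.host && absolute.port == base.port
    absolute.fragment = nil       # /people/1#bio and /people/1 are one page
    absolute.to_s
  end
  urls.compact.uniq
end
```

Something like same_host_urls('http://localhost:4000/people', urls) would then give a list of pages to fetch next, without wandering off to external sites or mailto: links.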

lolcats

Here's something I hadn't seen given a name before: the practice of putting captions with poor grammar and IM-style spelling onto pictures of cats to humorous effect. The resulting pictures are referred to as lolcats. Apparently, it is quite profound and clever.

The site I Can Has Cheezburger has many examples of the form. There's even a programming language, LOLCODE, which uses lolcat-style idioms, e.g. (explanation of keywords etc.):

HAI
CAN HAS STDIO?
I HAS A VAR
GIMMEH VAR
IZ VAR BIGGER THAN 10?
  YARLY
    BTW this is true
    VISIBLE "BIG NUMBER!"
  NOWAI
    BTW this is false
    VISIBLE "LITTLE NUMBER!"
  KTHX
KTHXBYE
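
For anyone not fluent in lolcat, here is a rough Ruby equivalent. It rests on my reading of the keywords rather than any specification: GIMMEH is assumed to read a line of input, VISIBLE to print, and BTW to start a comment. The method name lolcat_program and the conversion of the input to an integer are choices made for this sketch.

```ruby
# HAI ... KTHXBYE: read a number, then report whether it is bigger than 10.
# input and output default to the terminal but can be any IO-like objects.
def lolcat_program(input = $stdin, output = $stdout)
  var = input.gets.to_i            # I HAS A VAR / GIMMEH VAR
  if var > 10                      # IZ VAR BIGGER THAN 10?
    output.puts 'BIG NUMBER!'      # YARLY ... VISIBLE "BIG NUMBER!"
  else
    output.puts 'LITTLE NUMBER!'   # NOWAI ... VISIBLE "LITTLE NUMBER!"
  end                              # KTHX
end
```

Adding a bare call to lolcat_program at the end of a script makes it wait for a number on standard input, much as the original presumably does.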

Well I never.
