elliot's blog

Are "jobs" just disguised RPC in a RESTful application?

In a RESTful web application design, you typically start by identifying the resources in your application: the nouns. For example, imagine you're writing a library app and you're working on adding items to a catalogue. So you've got catalogues and items as your nouns.

You then decide to implement the operations on items, which live in the catalogue. In REST, HTTP verbs map onto the operations you want to perform: POST = create a new resource when you don't know what its identifier should be; PUT = update; DELETE = delete; GET = query resources. So you might end up with:

| HTTP request | Operation performed on resource | Returns |
| --- | --- | --- |
| GET /catalogue/items?term=potter | Retrieve items containing the term "potter" | Representation of items, with a 200 OK status code |
| POST /catalogue/items (request body contains representation of new item) | Add a new item to the catalogue | 201 Created status code, with Location header set to URI of new resource, and representation of resource in response body |
| PUT /catalogue/items/<control number> (request body contains representation of updated item) | Replace existing representation with an updated one | 200 OK status code |
| DELETE /catalogue/items/<control number> | Remove item at the specified location | 204 No Content status code |
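As a rough illustration, the operations above can be sketched as a tiny in-memory handler in Python. This is only a sketch under assumptions: the Catalogue class, the handle method, the sequential control numbers and the idea that the search term is matched against a "title" field are all made up for the example, and no real HTTP server is involved.

```python
import itertools


class Catalogue:
    """In-memory stand-in for the catalogue; items are plain dicts."""

    def __init__(self):
        self._items = {}                    # control number -> item
        self._numbers = itertools.count(1)  # hypothetical numbering scheme

    def handle(self, method, path, term=None, body=None):
        """Return (status code, headers, response body) for one request."""
        parts = path.strip("/").split("/")
        if parts[:2] != ["catalogue", "items"] or len(parts) > 3:
            return 404, {}, None

        if len(parts) == 2:  # the collection: /catalogue/items
            if method == "GET":
                # Assumption: the term is matched against each item's title.
                matches = [item for item in self._items.values()
                           if term is None
                           or term.lower() in item.get("title", "").lower()]
                return 200, {}, matches
            if method == "POST":
                # The server, not the client, picks the new identifier.
                number = str(next(self._numbers))
                item = dict(body, control_number=number)
                self._items[number] = item
                return 201, {"Location": f"/catalogue/items/{number}"}, item
            return 405, {}, None

        number = parts[2]  # a single item: /catalogue/items/<control number>
        if number not in self._items:
            return 404, {}, None
        if method == "GET":
            return 200, {}, self._items[number]
        if method == "PUT":
            # Replace the stored representation wholesale.
            self._items[number] = dict(body, control_number=number)
            return 200, {}, self._items[number]
        if method == "DELETE":
            del self._items[number]
            return 204, {}, None
        return 405, {}, None
```

A POST to /catalogue/items hands back the Location of the new item, and every later GET, PUT or DELETE goes to that URI, so the resource stays visible in each request.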

Fairly typical REST.

Then you realise you want to upload a whole pile of items at once, embedded in a single request for efficiency; but you don't want the client to have to wait while the items are inserted into the catalogue and properly indexed etc. Maybe it will take 5 minutes or something, and you don't want to leave a web client hanging. Or perhaps you want to upload only a single item, but once items are uploaded they are put into a queue for processing by another system, so there's a wait.

What are your choices? Here are some ideas, partly gleaned from the RESTful Web Services book:

  • Don't allow bulk uploads. You can only upload one item at a time, and you just have to wait until the operation completes and you get your proper status code back. Not really a solution, though.
  • Allow bulk uploads, process the items, and return a 207 Multi-Status code with the response. The response body contains a list of response codes and status reports, one for each uploaded item, but the client still has to wait for all the processing to finish before anything comes back.
  • Allow a bulk upload, but return a 202 Accepted status, spawning asynchronous jobs to do the processing in the background. The response body can contain the URIs for each uploaded item. Each item then has a status which can be queried by asking for the resource again. For example, when an item is first uploaded, you get its URI back as a Location; when you GET that, the object has its status set to Inactive or Under processing or something. When processing is complete, you get a status like Complete on the item instead. The down-side is that your resource representations are polluted with status information, which could be good or bad.
  • As above, but your request to /catalogue/items returns a 202 along with a handle to a "transaction" or "job" resource which wraps the resources you uploaded. Effectively, you treat the upload itself as a resource with a status you can query; the resources attached to that resource don't have to be polluted with status information, but you perhaps lose the granularity of individual status codes on resources. Or maybe you produce one transaction resource per uploaded resource?
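The last option can be sketched roughly as follows, in Python. Everything named here is an assumption for illustration: the BulkUploads class, the /catalogue/uploads/<n> path for the "transaction" resource, and the run_worker_once method, which stands in for whatever background process would really do the inserting and indexing.

```python
from collections import deque


class BulkUploads:
    """Toy model of an asynchronous bulk upload with a queryable status."""

    def __init__(self):
        self.items = {}          # item URI -> item representation
        self.uploads = {}        # upload URI -> {"status": ..., "items": [...]}
        self._pending = deque()  # work waiting for the background worker

    def post_items(self, representations):
        """Bulk POST: queue the work and answer 202 straight away."""
        upload_uri = f"/catalogue/uploads/{len(self.uploads) + 1}"
        self.uploads[upload_uri] = {"status": "Under processing", "items": []}
        self._pending.append((upload_uri, list(representations)))
        return 202, {"Location": upload_uri}

    def get_upload(self, upload_uri):
        """GET on the upload resource: the client polls this for progress."""
        upload = self.uploads.get(upload_uri)
        return (200, upload) if upload is not None else (404, None)

    def run_worker_once(self):
        """Stand-in for the background job: process one queued upload."""
        if not self._pending:
            return
        upload_uri, representations = self._pending.popleft()
        for representation in representations:
            item_uri = f"/catalogue/items/{len(self.items) + 1}"
            self.items[item_uri] = representation
            self.uploads[upload_uri]["items"].append(item_uri)
        self.uploads[upload_uri]["status"] = "Complete"
```

The client gets its 202 immediately, then polls the upload URI until the status changes; the items themselves carry no status field. Producing one upload resource per item instead would restore per-item granularity at the cost of more resources to poll.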

However, what I'm not so keen on is the idea of a job or service being a resource. Why? Well, if I want to create items in my catalogue, I don't want to wrap them in a job and post them to /jobs; if I want to query my items, I don't want to have to go to a query service at /services/query or similar.

What these paths hint at to me is that an operation is being represented by that path, rather than a resource: effectively, calling them is like doing RPC: you pass the resources you want to act on as arguments to the procedure you're calling. Often, there's also some implicit resource hidden away behind the job or service. Compare:

  • GET /catalogue/items?term=potter: the catalogue is visible, and we know we're querying items within it
  • GET /services/query?term=potter: here there's an implicit catalogue and its items behind the service; effectively, these objects are passed invisibly to the procedure we're calling; also what we're querying is not explicit


  • POST /catalogue/items: we're appending a new item to the catalogue; we can infer that our new item will then be available at /catalogue/items/<some identifier>
  • POST /jobs: job is an amorphous category, and we could post pretty much any type of resource into it; and there aren't any hints from the API about how to get at those resources once we've posted them

It's kind of like the difference between object-oriented design (REST) and procedural design (RPC). While a job might look like a resource, my opinion is that it's really an amorphous wrapper around the real resource you should be representing. Typically, jobs get introduced to cope with asynchronous updates; I'd prefer to see asynchronous operations occurring on proper resources, but exposed using the batch processing approaches outlined above. Otherwise I fear you might lose your resources inside some vague blob of a "job" or "service".
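The object-oriented versus procedural analogy can be made concrete with a toy Python sketch. The names (ToyCatalogue, find_items, query) are invented for the example and the data is made up; the only point is where the catalogue appears in each call.

```python
class ToyCatalogue:
    """A resource: callers hold it and can see what they are querying."""

    def __init__(self, titles):
        self.items = list(titles)

    def find_items(self, term):
        return [title for title in self.items if term.lower() in title.lower()]


# Resource style, like GET /catalogue/items?term=potter:
# the catalogue is named in the call itself.
library = ToyCatalogue(["Harry Potter", "Dune"])
resource_result = library.find_items("potter")

# Procedure style, like GET /services/query?term=potter:
# the catalogue is hidden state that the procedure reaches for on its own.
_hidden_catalogue = library


def query(term):
    return _hidden_catalogue.find_items(term)


procedure_result = query("potter")
```

Both calls return the same items, but only the first tells the reader which catalogue was searched.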

Neats vs. scruffies

I did my Ph.D. in artificial intelligence, so was interested to read a few Wikipedia articles about it. One distinction I'd never heard of was neats vs. scruffies in the field.

I put myself in the scruffies camp, probably, though I always had a yen for predicate logic and formal grammars. To my mind, some of the AI scruffies weren't scruffy enough, and tried to model human intelligence without any reference to psychological data. I tried to redress the balance a bit, and compared my program's output with psychological data on human inference during story comprehension. You can read all about it here.

At the time I did my Ph.D., I was pretty unfashionable, as I was researching symbolic AI approaches, while everyone around me seemed to be doing neural networks. However, I thought that while sub-symbolic approaches might produce intelligent output, I struggled to see how that would lead to a description of the solution, or anything that might be built on or added to by humans. If you're trying to program a reasoning system, for example, is it enough to train a neural network to create associations, or do you need to write something which can reflect on the process by which it reached its solutions? Neural nets are great for recognition tasks, but I was never convinced they were suitable for reflecting on how they completed the task. I'm sure there are plenty of counter-arguments to my limited opinion, so feel free to enlighten me.

The exciting things I've done

What I've been up to recently, tech first:

  • I've been coordinating (in the loosest sense of the word) a project at Talis to build a library-specific layer (written in Java) over the Talis Platform. It's the first project I've coordinated which involves several other people, and it's been challenging but worthwhile to do. We try to follow agile practices, and are currently doing mini-sprints, a week at a time with planning on Monday, using a traditional story board plus Jira for issue tracking through the week. We do some pair programming, which has been really good fun (initially I was a bit daunted by it, but the team is very supportive of each other and includes lots of talented individuals with different strengths - I'm learning a lot).
    What we're doing: basically we store lots of RDF in the Platform, then our OPAC (Prism) fetches it through our library-specific layer, which talks to the Platform, which returns stuff to the OPAC. So we've been putting functionality into our library-specific layer to support Prism. I've been keen to make sure everything is solid, so we've invested a lot of effort in a thorough suite of unit and functional tests. I've probably spent about two weeks out of the last three doing testing.
    To be honest, the whole testing terminology confuses me; but I have come across a useful continuum to describe different types of testing:
    • White box testing covers unit testing, testing individual components in isolation with an awareness of the application's internal structure. We do a lot of this in development, mainly using JUnit and EasyMock, to test pretty much every class and its methods.
    • Black box testing treats the application as something you put stuff into and get stuff out of; it doesn't require any knowledge of the internals, just the public API. For this, we've been using Canoo WebTest. I half like WebTest: it's marginally better than writing Java code to do tests, and does mean that you can write the tests without having to be a Java programmer (providing you understand Ant a bit). I'm still not sure it's quite right, but it just about does the job. We use this to send HTTP requests to a running instance of the application, and verify aspects of the responses (e.g. status codes, XML tags, text). Because we're testing a RESTful app, we don't really need something like Selenium, which pretends to be a human user: we just want to send one-off requests and check responses.
    • Grey box testing is in-between the two above. Basically, I think of it as testing against real objects which the application has to integrate with (e.g. databases, network sockets, threads, start/stop scripts, filesystem). While you can mock a lot of this in unit tests, this isn't a great replacement for testing against the real thing. Our tests in this region do things like checking whether files get moved around properly by certain processes and checking whether the web interface starts and responds correctly. We've been doing this with JUnit, the Apache HttpClient, Jetty, custom Java code, etc. This, to my mind, is the messiest sort of testing. We've also explored interesting ways to isolate these tests from the real platform, currently by creating our own mock HttpClient and HttpMethod extensions, which do the trick.
  • I've had a bit of time to think about Jangle, a more generic library API/HTTP proxy tool written in Ruby, as discussions about it have got more interesting recently. I've been using it as a generic HTTP proxy to sit between a client and a REST service, and also to try out some ideas about how to structure this type of application in a simple, flexible way. I've put together some playground code which is in no way tested, runnable or even intelligible, but is fun. I'm hopeful I'll get more time to work on it before too long.
  • Moving house. Sorting things out for that has taken up quite a bit of time, e.g. cleaning the house, sorting out the garden a bit, ringing people up, sending email.
  • Child care. Nicola has had a nasty stomach bug, so I spent most of the weekend and Monday looking after Madeleine. These days we end up playing Uno a lot (she is very good at it), pretending to be on treasure hunts, dancing (Hot Chip, Cabaret Voltaire and "The King of the Swingers" have been recent faves), and going to the park. Madeleine is quite argumentative at the moment; Nicola says it's because she's a Smith, and that we're all like it (Chloe, are you reading this?).
    For instance, we spent a good 10 minutes a couple of days ago arguing about her Fifi and the Flowertots plate, with Madeleine claiming that Poppy is holding a melon, while I pointed out it must be a gooseberry as the nearby strawberry would otherwise be the same size as a melon. Madeleine claimed it must be a small melon and/or giant strawberry. (Here is a picture of the plate.) Then I stopped myself, realising that I'm a full-grown adult and she's only 4. And that I was simply being petty. That's what Nicola's talking about.

Presentation at Coventry University

I did a presentation on Rails to some students at Coventry University tonight, as part of their e-commerce M.Sc. course. Here are the materials (introductory presentation on Rails and a script for a demo of Rails functionality).

Memory of France by Paul Celan

A poem I come back to all the time. Superficially simple, but with such clear, painstaking and vivid imagery; I love the gentle falling rhythm in the lines towards the end of the poem, how each line pivots around its centre. Shame I can't really appreciate it in its original language (German).

Together with me recall: the sky of Paris, that giant autumn crocus...
We went shopping for hearts at the flower girl's booth:
they were blue and they opened up in the water.
It began to rain in our room,
and our neighbour came in, Monsieur Le Songe, a lean little man.
We played cards, I lost the irises of my eyes;
you lent me your hair, I lost it, he struck us down.
He left by the door, the rain followed him out.
We were dead and were able to breathe.

What do you enjoy?

How do you decide what you enjoy? What does "enjoy" mean, anyway?

Which leads me to the related questions: Why do we engage with art? What do we get out of it?

I am quite obsessed with this as an issue. I spend a huge amount of time and money on music. I love music. I can talk about my favourite bands all day. But my engagement with it is shallow, in terms of how well I can talk about it: I don't understand music in a technical way, even though I think it engages me both intellectually and emotionally. What am I getting out of it?

(I love dancing, by the way. You might not think that. I rarely go to clubs. But I absolutely love dancing. I used to go to loads of club nights at university and just immerse myself in the music, especially the beats and basslines. Now it's restricted to the kitchen while I'm washing up. I don't really like dancing at weddings or parties, mind.)

And recently I've got back into reading science fiction books. I love science fiction, too. Again, though, my engagement is shallow: I know about its history and general themes, and I can recognise writers and styles, but I don't theorise about its deeper meanings. I'm happy to just chat about it. (Though I did write a dissertation on JG Ballard and William Burroughs once.)

On the non-art side, I enjoy gardening a lot. I can lose myself when I'm out in the garden, pruning, weeding, sowing, just looking and smelling and hearing the garden around me. I think that is one of my purest enjoyments; I particularly enjoy it because I don't feel I have to intellectualise it. Maybe that's my problem with engaging with art, I think too much about it. Isn't that the point?

I've excluded familial relationships and friendships from this monologue: I don't so much enjoy these, as experience joy (and pain) because of them. Is this the same as enjoyment?

I've also excluded computers. As I work with them all the time, I have a love/hate relationship with them. I can't make a blanket statement that I enjoy programming. Often I don't. Sometimes I find it intensely frustrating and limiting. Other times I find it mind numbing. Then other times I get so engrossed that hours pass without me moving to eat or go to the toilet. It's one of those things.

I suppose what I'm getting at is: I don't feel I have a deep love for any hobby or pastime. Perhaps rather than "love" I should say I don't feel consumed by anything, or driven, or ambitious, or even that passionate. Should I be? Does it matter if I'm not? Is it just the time of year? Maybe I watch people on TV culture shows too much and think I should be able to intelligently discuss my experience of the world, the same way the participants on those programmes do. Perhaps I should just relax.

What do you enjoy, by the way?

(Now I've written all this, I'm not even sure why I did it. Perhaps it's because I've spent the whole day listening to Orchestral Manoeuvres in the Dark, a band I love and have done for 20 odd years. It made me wonder what it is about music that I like so much. And why I place so much importance on a person's music taste as a measure of their personality. I probably shouldn't, but I do.)


Quite interesting. Find out about your online presence.

Here's my QDOS.

Poem attributed to Han-shan (c. 9th Century)

My mind is like the autumn moon
Shining clean and clear in the green pool.
No, that is not a good comparison.
Tell me, how shall I explain?
