All posts by mark

Dealing with awkward subtitle problems in Handbrake

I know very little about Handbrake; these are just some notes on what I personally do to reduce my confusion about why subtitles aren’t being ripped properly, and how I manually fix that, but I almost certainly can’t answer any questions about issues you might be having! This is just here in the hope that it might be useful to someone…

I’ve been on a long-running project to rip some of our DVDs to network attached storage, so that playing them is a much more pleasant experience: we can then easily play any of our DVDs around the flat without suffering cute-but-unusable DVD menus, condescending anti-piracy warnings or trashy trailers. With some DVDs that are already quite poor quality I’ll just image the disk and copy the ISO to the NAS, but in most cases I use Handbrake to transcode just the titles I want from the DVD using a codec with rather better compression than MPEG-2. There’s inevitably a loss of quality doing this transcoding, of course, but in most cases I don’t mind.

DVD Subtitles

One slightly vexing issue that sometimes comes up when doing this is what to do about the subtitle tracks on the DVD. For plenty of films or TV shows there aren’t any points at which you’d expect a subtitle to be shown, so you don’t need to worry about them. However, plenty of our DVDs do have brief sections in foreign languages, for example, that you’d expect to be subtitled.

There are various choices about how you can handle these when ripping, but I prefer what’s really the option that loses most information: “burning in” any subtitles that are there for the purposes of translating foreign language sections into English or are added for comic effect. “Burning in” these images from a subtitle track means that you’re overlaying the images before the source video is transcoded, so you can’t remove those subtitles or switch them to other languages afterwards. Obviously lots of people prefer not to burn in the subtitles for that reason, but I tend to do this because it means I’m not subject to the vagaries of how different video players handle subtitles, and these rips are only for my own personal use anyway.

As well as “burning in” subtitles, another concept you might need to know about is of “forced subtitles”. Any subtitles in a given subtitle track might be marked as “forced” – this is used to tell the DVD player to display these even if the person hasn’t chosen to see all subtitles from a particular subtitle track – the intended use of this is exactly for brief sections of foreign languages in films, as I understand it.

In most cases, what works fine in Handbrake for what I want to do is to use the “Foreign Audio Search” option in the “Subtitle List” tab, selecting “Burned In” and then guessing whether to tick the “Forced Subtitles Only” box or not – generally I’ll leave that unchecked, unless when I look at the list of available subtitles (under “Add”) there’s only one English language subtitle track, in which case it’s probably forcing subtitles for any foreign language sections that are subtitled. This option should look through all the subtitle tracks for ones that appear less than 10% of the time, look for forced subtitles, and make a sensible choice about what to use: the Handbrake documentation explains this.

However, there are various ways this can go wrong – the “Foreign Audio Search” option sometimes makes the wrong choice in peculiar ways – e.g. I’ve seen it pick the right track when you’re just ripping one chapter from a title, but the wrong one when you’ve selected all the chapters (!). Also, there’s just very little consistency in how DVD producers choose whether to mark subtitles as forced.

When it goes wrong, here’s the method I use to manually pick the right subtitle track to burn in – essentially this is to do the “Foreign Audio Search” scan on just one chapter that I know has both audio that should be subtitled and audio that should not, look through the “Activity Log” to see the results of that scan, and then manually select the right subtitle track based on that.


To step through that in more detail, here’s what I’d do:

  • Select from the “Subtitle List” tab the “Foreign Audio Search” option, and add that to the subtitles list. It doesn’t matter what options you choose, since we’re just adding this to get the results of the search into the activity log.
  • Find a chapter in the title you want to rip that has some audio you’d want to have subtitles for, and some that you don’t want to be subtitled. (You can select multiple chapters to achieve this if you want – the point of just choosing a small number of chapters is only to make the scan quicker.) I’d normally do this by (a) knowing a bit of the film with such audio and (b) finding that bit of the film in Totem, which conveniently shows the title and chapter number in the window title.
  • Select just those chapters to rip, add them to the queue
  • Start encoding
  • Open the activity log window
  • Once the “subtitle scan” pass has completed (it should be the first thing that Handbrake does) scroll up in the activity log to find the lines that look something like this:

[13:35:12] Subtitle track 0 (id 0x20bd) ‘English’: 87 hits (0 forced)
[13:35:12] Subtitle track 1 (id 0x21bd) ‘English’: 89 hits (0 forced)
[13:35:12] Subtitle track 2 (id 0x22bd) ‘English’: 6 hits (0 forced)

  • That means that subtitle track 2 is the right one, because there are subtitles for only some of the audio – the other two probably have subtitles for every line of dialogue, even if it’s in English. So, now we want to set up Handbrake to rip the complete title but with that subtitle track added manually:
  • Remove the “Foreign Audio Search” option from the subtitle list.
  • Click “Add” in the subtitle list tab, and select subtitle track 3 (n.b. not 2, since the graphical user interface (GUI) for Handbrake numbers subtitle tracks starting at 1, not 0.) Make sure you don’t select “Forced Subtitles Only” since the subtitles on that track aren’t forced (see “0 forced” in the output above). (I would also select “Burn in” whenever adding a subtitle track manually like this, for the reasons discussed above – but you might well have different views about that.)
  • Then select all the chapters of the title, add the title to the queue and rip it as normal.

As other examples of things you might see in that list of “hits” in the results of the foreign audio search pass, consider the following:

[16:57:06] Subtitle track 0 (id 0x20bd) ‘English’: 77 hits (0 forced)
[16:57:06] Subtitle track 1 (id 0x21bd) ‘English’: 78 hits (0 forced)
[16:57:06] Subtitle track 14 (id 0x2ebd) ‘English’: 8 hits (8 forced)

In this example, it happens that the subtitles are “forced”, but it’s still clear that track 14 (0-based) is the one which just has the subtitles for the foreign language section, so I’d add track 15 (14 + 1) manually in the GUI as above – and in this case it doesn’t matter whether you select “Forced Subtitles Only” or not, since all of the subtitles on that track appear to be forced.

As a final example, you might see output like this:

[18:23:55] Subtitle track 0 (id 0x20bd) ‘English’: 92 hits (11 forced)

In that example there’s a single English subtitle track, which marks some subtitles as “forced” to indicate that those are for foreign language audio. In that case, in the GUI I would manually add subtitle track 1 (0 + 1) but would have to select the “Forced Subtitles Only” option to avoid getting subtitles for everything.
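The reasoning in the three examples above can be written down as a small Python sketch. This is only my reading of the rule of thumb, not anything Handbrake provides: the function name `suggest_track` and the heuristic itself (take the track with the fewest non-zero hits, and tick “Forced Subtitles Only” only when some but not all of its hits are forced) are assumptions for illustration, and the regular expression assumes log lines shaped exactly like the ones quoted above.

```python
import re

# Matches activity log lines shaped like the ones quoted above, e.g.
# [13:35:12] Subtitle track 2 (id 0x22bd) ‘English’: 6 hits (0 forced)
HIT_LINE = re.compile(
    r"Subtitle track (\d+) \(id 0x[0-9a-fA-F]+\).*?: (\d+) hits \((\d+) forced\)"
)


def suggest_track(log_text):
    """Return (gui_track_number, forced_only), or None if no hits are found.

    Heuristic (an assumption, not Handbrake's own logic): the track with the
    fewest non-zero hits is probably the one that only subtitles the foreign
    language sections.
    """
    tracks = []
    for match in HIT_LINE.finditer(log_text):
        index, hits, forced = (int(group) for group in match.groups())
        if hits > 0:
            tracks.append((hits, index, forced))
    if not tracks:
        return None
    hits, index, forced = min(tracks)
    # Only restrict to forced subtitles when the track mixes forced and
    # unforced ones; if all or none are forced the option makes no difference.
    forced_only = 0 < forced < hits
    # The log numbers tracks from 0, but the GUI numbers them from 1.
    return index + 1, forced_only
```

Fed the three log excerpts above, this suggests GUI tracks 3, 15 and 1 respectively, with “Forced Subtitles Only” ticked only in the last case. It will guess wrong on discs that don’t follow these patterns, so it’s worth checking the result against the log by eye.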

Approximate UK postcode boundaries from the Voronoi diagram of ONSPD

TL;DR: you can try entering a postcode here and click through to see the very approximate boundaries of the postcode unit, sector, district and area around there.

This project started from me being curious about some simple questions – what does the boundary look like of all the houses with the same postcode as us? How much of our street does it cover? How much bigger would the boundary of your postcode be if you lived somewhere much more rural?

The problem with answering these questions properly in general (i.e. make it easy for anyone to find out) is that it would be incredibly expensive to do so. There are many underlying reasons for this, but essentially it comes down to the fact that you need the latitude, longitude and postcode of every building in the UK. The only dataset which has this information is Ordnance Survey’s Address-Base product, which has wretched licensing terms. Even if I had £130,000 a year (pricing here) to spend on this, I wouldn’t be able to share the results with people due to the licensing, which is much of the point.

(Although the reasons I was interested in this originally are a bit frivolous, it really is a long-running scandal that this address database isn’t open – Sym wrote about one of the reasons why this matters on the Democracy Club blog and this case study about the Open Addresses project from the ODI gives you lots of good background about the problem, including the huge economic benefits for the country you could expect from this data being made open.)

Anyway, unfortunately, this means that I’ll have to settle for answering these questions imprecisely, using data that is open. A good starting point for this is the ONSPD (the Office for National Statistics Postcode Directory), which contains a CSV file with every postcode in the UK, and, for most of them, it has the latitude and longitude of the centroid of that postcode.

What I wanted to do, essentially, is to find, for each postcode centroid, the boundary of all the points that are closer to that centroid than the centroid of any other postcode. In mathematical terms, we want the Voronoi diagram of the postcode centroids, and we can calculate that with Python’s matplotlib by generating the Delaunay Triangulation of the points with matplotlib.delaunay. (Delaunay triangulation is a closely related geometric concept, from which you can derive the Voronoi diagram.)
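To make the definition concrete, here is a minimal sketch that checks the defining property of a Voronoi cell directly rather than building any polygons: a point belongs to the cell of whichever centroid it is closest to. The function name `nearest_centroid` and the labels "A", "B" and "C" are placeholders, not real postcodes, and the coordinates are assumed to be planar (like eastings and northings).

```python
import math


def nearest_centroid(point, centroids):
    """Return the label of the centroid closest to `point`.

    `centroids` maps a label to an (x, y) pair in planar coordinates.
    A point lies in the Voronoi cell of exactly this centroid (ignoring
    ties on a cell boundary).
    """
    return min(centroids, key=lambda label: math.dist(point, centroids[label]))


# Placeholder centroids, purely for illustration.
centroids = {"A": (0.0, 0.0), "B": (10.0, 0.0), "C": (0.0, 10.0)}
cell = nearest_centroid((8.0, 1.0), centroids)
```

This brute-force check is far too slow to draw boundaries for millions of postcodes, which is why computing the diagram itself via a Delaunay triangulation is the practical route.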

That’s not the whole story, however, since we have to think about what happens around the edges of this cloud of postcode points. For example, here is the Voronoi diagram of just the postcode centroids in TD15 (Berwick-upon-Tweed):

The most obvious features there are probably the big spikes out to sea and to the south-east, but they actually aren’t anything to worry about: it’s just that the outermost postcode centroids around the coast at Berwick-upon-Tweed are concave, which produces big circumcircles from the Delaunay triangulation and so large triangles in the Voronoi diagram. Instead, the problem is the postcode centroid I’ve highlighted with a red arrow in the south. This point isn’t actually contained in any of the polygons in the generated Voronoi diagram. I don’t want to risk this happening around the edges of the cloud of postcode points, so before calculating the Voronoi diagram I’ve added 200 points in a very large diameter circle around the true postcode points. These “points at infinity” mean that each point around the edge is definitely contained in a polygon in the Voronoi diagram. For example, if we do the same with the Berwick-upon-Tweed example, you instead get a diagram like this:

I’ve highlighted the same postcode centroid with a red arrow, and you can see that this means it’s now contained in a polygon.
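Generating such a ring of extra points is simple. The following is a sketch under my own assumptions rather than the code from the script: the name `ring_of_points`, the choice of centring the circle on the middle of the bounding box, and the `radius_factor` of 100 are all illustrative. Only the count of 200 comes from the description above.

```python
import math


def ring_of_points(points, count=200, radius_factor=100.0):
    """Return `count` points evenly spaced on a large circle around `points`.

    The circle is centred on the middle of the bounding box of `points`,
    and its radius is `radius_factor` times the larger side of that box
    (both choices are assumptions for this sketch).
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    centre_x = (min(xs) + max(xs)) / 2
    centre_y = (min(ys) + max(ys)) / 2
    # Fall back to 1.0 so a single point still gets a non-zero radius.
    extent = max(max(xs) - min(xs), max(ys) - min(ys)) or 1.0
    radius = extent * radius_factor
    return [
        (
            centre_x + radius * math.cos(2 * math.pi * k / count),
            centre_y + radius * math.sin(2 * math.pi * k / count),
        )
        for k in range(count)
    ]
```

The extra points would be appended to the real centroids before computing the diagram, and the polygons belonging to them thrown away afterwards.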

Here’s an example of the polygons you get for the postcodes in SW2 1, which also includes these “points at infinity”:

This does mean that when you run this process for the whole UK, you might end up with massive postcode polygons around the coasts, so the script checks if any of a polygon’s points might lie outside the boundary of the UK (taken from OpenStreetMap) and if so clips that polygon to that boundary. (That boundary is a little way out to sea – you can see it as the purple line in the picture above – but it’s the most readily available good boundary data for the whole country I could find.)

Another inconvenience we have to deal with is that there are multiple postcodes for a single point in some cases. (One of the most famous examples is the DVLA in Swansea, which has 11 postcodes for one building. That pales in comparison to the Royal Mail Birmingham Mail Centre, though, which appears to have 411 postcodes.) We can’t have duplicate points when constructing the Voronoi diagram, so the script preserves a mapping from point to postcodes so we can work out later which polygon corresponds to which postcodes.
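One way to keep that mapping is a dictionary keyed on the coordinates. This is a sketch of the idea rather than the script’s actual code; `group_by_point` is a hypothetical name, the input is assumed to be `(postcode, x, y)` tuples, and the postcode strings in the usage lines are placeholders.

```python
from collections import defaultdict


def group_by_point(rows):
    """Collapse rows that share coordinates into one point.

    `rows` is an iterable of (postcode, x, y) tuples. Returns a list of
    the unique (x, y) points, in first-seen order, and a dict mapping
    each point to every postcode located there.
    """
    postcodes_at = defaultdict(list)
    for postcode, x, y in rows:
        postcodes_at[(x, y)].append(postcode)
    return list(postcodes_at), dict(postcodes_at)


# Placeholder rows: two postcodes share the first point.
rows = [("PC1", 1.0, 2.0), ("PC2", 1.0, 2.0), ("PC3", 3.0, 4.0)]
unique_points, postcodes_at = group_by_point(rows)
```

Only `unique_points` would go into the Voronoi calculation. Each resulting polygon can then be labelled with every postcode in `postcodes_at` for its point.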

One other thing that made this slightly more complicated is that the latitudes and longitudes in ONSPD are for the WGS 84 coordinate system¹ – if you generate the Voronoi diagram from these coordinates directly, you end up with polygons that are elongated, since lines of longitude converge as you go further north and we’re far from the equator. To avoid this, the script transforms coordinates onto the Ordnance Survey National Grid before calculating the Voronoi diagram and transforms the coordinates back to WGS 84 before generating the KML file. This reduces that distortion a lot, although the grid of course is rather off for the western parts of Northern Ireland.

¹ Correction: Matthew pointed out that ONSPD does have columns with eastings and northings as well as WGS 84 coordinates, so I could have avoided the first transformation.
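The size of that distortion is easy to estimate. Treating the Earth as a sphere (an approximation), a degree of longitude covers only cos(latitude) times the ground distance of a degree of latitude. The function name below is just for illustration.

```python
import math


def longitude_scale(latitude_degrees):
    """Ratio of the ground length of one degree of longitude to one degree
    of latitude at the given latitude, on a spherical Earth."""
    return math.cos(math.radians(latitude_degrees))


# At 60°N a degree of longitude is about half as long as a degree of latitude.
scale_at_60 = longitude_scale(60.0)
```

So using raw latitude and longitude as if they were flat x and y coordinates exaggerates east-west distances relative to north-south ones, and increasingly so further north, which is why projecting onto a grid first gives better-shaped polygons.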

A last stage of post-processing is to take the shapes around these individual postcode centroids and agglomerate them into higher level groupings of postcodes, in particular:

  • The postcode area, e.g. EH
  • The postcode district, e.g. EH8
  • The postcode sector, e.g. EH8 9
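Working out which groupings a postcode belongs to only needs some string handling, since the inward part of a postcode is always its last three characters. A minimal sketch follows; `postcode_parts` is a hypothetical helper, not taken from the script, and "EH8 9AB" is a made-up unit used only because it falls in the sector mentioned above.

```python
import re


def postcode_parts(postcode):
    """Split a UK postcode into its area, district and sector.

    Spaces are ignored, so both "EH8 9AB" and "EH89AB" work. The inward
    code is assumed to be the final three characters.
    """
    compact = postcode.replace(" ", "").upper()
    outward, inward = compact[:-3], compact[-3:]
    # The area is the leading run of letters in the outward code.
    area = re.match(r"[A-Z]+", outward).group()
    return {
        "area": area,
        "district": outward,
        "sector": f"{outward} {inward[0]}",
    }


parts = postcode_parts("EH8 9AB")
```

Grouping the individual polygons by each of these three keys, and taking the union of the shapes in each group, gives the higher level boundaries.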

It’s nice to be able to see these higher level areas too, so this extra step seemed worth doing, e.g. here are the postcode areas for the whole country:

Anyway, my script for generating the original postcode polygons takes a few hours (there are about 2.6 million postcodes in the ONSPD database), which could certainly be sped up a lot, but doesn’t bother me too much since I’d only really need to update it on a new release of ONSPD. And this was just a fun thing to do anyway. (It was meant to be a “quick hack”, but I can’t really call it that given the amount of time it’s taken.)

One small note about that is that I hadn’t really used Python’s multiprocessing library before, but it made it very easy to create a pool of as many worker processes as I have cores on my laptop to speed up the generation. This can be nicely combined with the tqdm package for generating progress meters, as you can see here.

The results are somehow very satisfying to me – seeing the tessellation of the polygons, particularly in areas with widely varying population density, is rather beautiful.

Get this data

If you just want to look up a single postcode, you can put your postcode into a MapIt instance I set up with all these boundaries loaded into it.

That MapIt instance also provides a helpful API for getting the data – see the details on the front page.

If you just want an archive with all 1.7 million KML files, you can also download that.

Please let me know in the comments if you make anything fun with these!

Related projects


Of course, I’m not the only person to have done this. I found out after working on this on and off in my spare time for a while that there’s a company called Geolytix who have taken a similar approach to generating postcode sector boundaries, but who used real-world features (e.g. roads, railways) close to the inferred boundaries and adjusted the boundaries to match. There’s more about that in this blog post of theirs:

Postal Sector Boundaries by Geolytix

They’ve released those boundaries (down to the postcode sector level) as open data as a snapshot in 2012, but are charging for updates, as explained there.

The results look rather good! Personally, I’m more interested in the fine grained postcode boundaries (below the postcode sector level), which aren’t included in that output, but it’s nice to see this approach being taken a big step further than I have.

OS Code-Point Polygons

The Ordnance Survey, every civic tech developer’s favourite vampire squid, do sell postcode polygons that are presumably as good as you can get. (It sounds from the Geolytix blog post above that they are generated as the Voronoi diagram of Address-Base points, which is what I’d ideally like to do myself – i.e. just run this script on all the addresses in Address-Base.) You can see that here:

Naturally, because it’s the Ordnance Survey, this product is also very expensive and has crappy licensing terms.

What’s next for this project?

I’m not sure if I’ll do any more on this, since it’s been much more of a time sink than I’d hoped already, but if I were to take it further then things I have in mind are:

  • Get a better coastline polygon for the UK. The OSM administrative boundary for the UK was very convenient to use, but because it extends out into the sea some distance, it makes areas around the coast bulge out and that means that you can’t really compare the areas of postcode shapes, which is one of the things I was interested in. You could create a better polygon as a union of some of the shapes in Boundary-Line, for example.
  • Adding other sources of postcode points, e.g. OpenAddresses – although the project is in hibernation, I hoped I’d be able to use the addresses they’d already collected, but the bulk downloads don’t seem to include coordinates. I might be missing something obvious, but I haven’t heard back from emailing them about that.
  • It would be nice to make a Leaflet-based viewer for scrolling around the country and exploring the data. (For viewing multiple areas I’ve been loading them into QGIS, but it would be nice to have a web-based alternative.)

Source code and recreating the data

Source code

You can clone the source for this project from:

If you want to generate this data yourself you can follow the instructions in the README.

When does Columbo first appear in each episode?

I made the graph below for a short talk I gave about the TV series “Columbo”, which I think is a marvellous programme. The point was to depict the typical structure of the show for people who didn’t know it: the substantial blue section of each bar (before Columbo first appears) usually just consists of the murderer going about the crime, so you know exactly how it was committed and by whom. The red section of the bar is the part where you see Columbo trying to figure everything out. (People sometimes say this makes it a “howcatchem”, not a “whodunnit”.)

A graph showing when Columbo first appears in each episode. You can click through to a larger version in which you can see which episode is which.

Anyway, I thought it might be nice to put the graph online, which is most of the point of this post, but thought I might add a few notes about why I like Columbo so much.

I think Columbo suffers rather from so many people remembering its clichés (e.g. his basset hound (“Dog”), the cigars, the raincoat, “just one more thing”; his never-seen wife) but not necessarily the things that made it a powerful show.

For me perhaps the most important of those is the way that class, power and wealth are treated on the show. There’s a joke on a Columbo podcast I sometimes listen to that he must work in a special department of the police force where he only deals with crimes of the very wealthy and privileged. And indeed, the privilege of the murderers is frequently emphasized by Columbo; in one scene he literally tots up the value of someone’s house and possessions and works out with them how many years he’d have to work to afford them – and yet this is without bitterness, and he never drops the persona of the enthusiastic and respectful fan.

The house from "Étude in Black"
This is the kind of house a conductor lives in, in the world of Columbo

The awfulness of the murderers in Columbo is compounded by this dynamic; they consistently have an extraordinary sense of entitlement, believing they can get away with anything because of their position in society, and have absolutely no regard for the apparently down-at-heel detective. (This makes the inevitable point where they realise how badly they underestimated him all the sweeter, of course.)

I think this partly resonates with me so much because, while I’m undoubtedly incredibly fortunate in my situation in life, my job is to work on projects that try to increase the transparency of institutions that affect our lives and try to hold people in power to account. This is difficult, frustrating work, and it often seems (particularly in the political climate at the moment) that the successes are overwhelmed by the setbacks.

But the world of Columbo isn’t like that: it’s a fantasy world in which someone who appears powerless succeeds 100% of the time in holding the powerful to account – no matter how rich they are, or how well connected, or even if they’re his own deputy police commissioner. And he does that all while being respectful, polite and enthusiastic.¹

I could write a lot more about what makes the show important to me, why I admire the character of Columbo, and the various flaws I perceive in the show, but I’m trying to get some blog posts actually finished at the moment, so I’ll just leave you with some photos of Peter Falk throughout the 35 years that he played Columbo, and a list of the occupations of the murderers in the series.

¹ If I remember right, Zack Handlen made this point well on the Just One More Thing podcast.

With the help of Wikipedia I made this list of the occupations of the murderers on Columbo. (Note that quite a few of them are “[occupation] to the stars!” where there wouldn’t otherwise necessarily be the required difference in social status :))

  • Psychiatrist
  • Attorney
  • Best-selling Novelist
  • Head of a detective agency
  • Retired Major General in the Army
  • Art critic
  • Independently wealthy (heiress)
  • Chemist / heir to a chemical company
  • Architect
  • Orchestral Conductor
  • (Unclear – retired?)
  • General manager of an American football team
  • Famous stage actors
  • Movie star
  • Cardiac surgeon
  • Chess grandmaster
  • TV chef / banker
  • CEO of a cosmetics company
  • Head of a winery
  • Senatorial candidate
  • (Subliminal) Advertising expert
  • Book publisher
  • Director of a think tank
  • Famous singer
  • Deputy police commissioner
  • Owner of a chain of gyms / fitness expert
  • Photographer
  • Head of a military academy
  • Auto executive
  • President of an electronics company
  • Psychiatrist
  • Movie star
  • Diplomat
  • CIA agent
  • Famous matador
  • Magician and club owner
  • Heir to a boat-building firm (? Fred Draper)
  • TV detective
  • Museum curator (?) in the family museum
  • Accountant (senior partner)
  • Mystery writer
  • Restaurant critic
  • TV executive
  • Mind control guru
  • Arms dealer / poet / raconteur / terrorist
  • (Fake) Psychic
  • Hollywood film director
  • Sex therapist
  • Head of a paramilitary school for mercenaries / retired colonel / head of a think tank
  • Famous artist
  • Magazine publisher
  • Political operative (previously lawyer with the DA)
  • Real estate executive
  • “Dentist to the stars”
  • Gigolo
  • Spoilt frat boys
  • TV host
  • Lawyer
  • Jeweller
  • (No murder, but the kidnapper was an ambulance driver)
  • Gambler
  • Wealthy socialite
  • Radio host
  • Insurance investigator
  • Thoroughbred ranch owner
  • Crime scene investigator
  • “Funeral director to the stars”
  • Film composer and conductor
  • Nightclub promoter

Books (2014 to 2016 ish)

Someone (rather surprisingly!) mentioned they’d enjoyed my last post about books I’d been reading, and would be interested in another one, hence this post.

Lots of these were recommendations from people I know, which I always hugely appreciate, but I haven’t attempted to note who recommended what below, partly since I’m not sure people would be happy with those recommendations being public.

(The links below are affiliate links to Amazon UK, in case that concerns you. Although I don’t know why I bother, really – I think such links have made me about £2 in total over the last year.)

“A Little Life” by Hanya Yanagihara

This is a long, brilliantly written and deeply upsetting novel. I cried more during this book than any I can remember in a long time, and had to stop reading often.

I have some reservations about this book, but on the whole it was an incredible experience to read. You should be warned, though, that it has descriptions of horrifying abuse in it.

“Utopia for Realists” by Rutger Bregman

This is an argument for a series of radical progressive policy ideas, including:

  • Universal basic income
  • Shorter working weeks
  • Open borders

It’s a bit of a polemic; it doesn’t really address some of the issues with universal basic income, for example, such as that some people in society do need more support from the state than others, and how you address that. However, it’s thought-provoking and it’s a good source of references to places where these policies have been tried. (e.g. I didn’t know that Richard Nixon was close to passing something very like a universal basic income in the USA.)

“The Taming of the Queen” by Philippa Gregory

This is one of Philippa Gregory’s novels about the Tudor period, specifically about Kateryn Parr, the last wife of Henry VIII. I had mixed feelings about this; after reading Hilary Mantel, the writing seemed a bit flat, and the basal exposition in dialogue got repetitive. It’s very tense, though, and a fascinating story that I knew nothing about.

“Command and Control” by Eric Schlosser

I found this book about the safety of nuclear weapons since their earliest development completely gripping, and very alarming. It interleaves the broader history with a detailed account of one particular incident, which worked very well, I thought.

“Hooves Above the Waves” by Laura Clay

(Disclosure: Laura is a friend.) I really enjoyed these dark and absorbing short stories. They also feature some creatures from Scottish mythology which I used to read stories about as a child, but hadn’t thought about for a long time. I’m looking forward to her next works.

“Bad Vibes” by Luke Haines

Subtitled “Britpop and my part in its downfall”, this is a bitter, angry and entertaining account by Luke Haines (of The Auteurs) of the Britpop years, how he disliked almost everyone else in that scene and none of them understood his genius, etc. etc. It works quite well as a companion piece / antidote to “The Last Party: Britpop, Blair and the Demise of English Rock”. I definitely found it more funny than objectionable, probably because I think it’s clear that he knows he’s an arse.

“The Stranger” / “The Outsider” by Albert Camus

I’d never read this before, but I think it’s always intriguing reading a book for the first time that you know the reputation of through popular culture. I can see why it’s so highly regarded, but the idea that lots of young men (apparently?) identify with the protagonist does upset me.

“Life Moves Pretty Fast” by Hadley Freeman

A joyful celebration of 80s movies, and a great source of recommendations for interesting films from that decade that I missed, or are worth rewatching.

I can’t agree about how highly she rates some of the films: for example, I like Ghostbusters, but it’s nowhere close to being my favourite film, as it is for her. Also, I loathe almost everything about “Ferris Bueller’s Day Off”. But that doesn’t matter at all, since she presents interesting, thought-provoking cases for all the films and the writing’s funny and involving throughout.

“Reasons to Stay Alive” by Matt Haig

This is an account of the author’s experience of clinical depression and recovery. (I’ve been treated for depression myself in the past and still struggle with it, but I’ve never experienced anything so severe as him.) There was a lot that I related to there, and it’s a quick, easy and surprisingly uplifting read.

“Ancillary Justice”, “Ancillary Sword” and “Ancillary Mercy” by Ann Leckie

This science-fiction trilogy (“space opera” I guess) has been rightly lauded; the world-building is fantastic and I loved the story. The way it plays with your perceptions of gender and appearance is fascinating as well.

“Into Thin Air” by Jon Krakauer

I read this after watching the film “Everest”. I found it very absorbing; I know a bit about mountaineering in the sense of “climbing Munros” but the tensions and contradictions of expensive guided climbs in the Himalayas, at altitudes where the human body is effectively dying, were largely new to me.

“The Girl on the Train” by Paula Hawkins

As far as plot goes, I found this a bit predictable, but it was certainly gripping and the unreliable-due-to-alcoholism narrator worked well.

“Good Omens” by Terry Pratchett and Neil Gaiman

One of many re-reads – it’s still such a gem of a book, which has some of my favourite jokes in it.

“Saints of the Shadow Bible” by Ian Rankin

So long as Ian Rankin keeps writing Rebus novels, I’m going to keep reading them. I remember this being up to the standard of the later books (i.e. good :))

“Deep Work” by Cal Newport

This is about how important it is professionally to be able to regularly get into that lovely state of deep concentration so you can work on hard problems, even if it’s for relatively short periods of time each day (e.g. 3 or 4 hours). It’s hard for me to evaluate this book, really, because it plays exactly to my prejudices about what constitutes worthwhile work, which several people have told me are a bit broken :) However, I found it inspiring—it’s OK to stand up for this!—and its practical suggestions for, say, making email less time-consuming, were useful. I’d recommend it.

“Age of Ambition” by Evan Osnos

A very well written and engaging book about contemporary China. I learned a lot from it. Highly recommended.

“Mansfield Park” and “Northanger Abbey” by Jane Austen

These are the two Jane Austen novels I know least well – I think I read them as a teenager, but couldn’t remember much about either. As far as Mansfield Park goes, I found Fanny Price pretty unsympathetic up to the point where everyone’s putting appalling pressure on her to marry Henry Crawford, at which point I found myself cheering her on. Northanger Abbey I loved too – particularly the awkwardness of trying to get to know people in Bath at the beginning. They’re both brilliant, obviously :)

The Peter Wimsey / Harriet Vane series books by Dorothy Sayers

Lord Peter Wimsey shares with Psmith the disconcerting contradiction in my mind as a reader that we must accept that these are clearly attractive men despite both wearing monocles. Anyway, my introduction to Dorothy Sayers was from the Nine Tailors and Five Red Herrings, but I think my favourite novels so far are those featuring Harriet Vane, and the journey of their relationship improving from the cringingly awful start in Strong Poison. Gaudy Night, in particular, is brilliant, and I found that novel and Busman’s Honeymoon very moving.

“Willpower: Rediscovering our Greatest Strength” by Roy F. Baumeister and John Tierney

It’s a bit curious writing about this book, because I’d have been very positive about it except for the issue that the research described in this popular science book is at the centre of the reproducibility crisis in psychology. The material about drawing “bright lines” when you’re trying to give something up, and the parts about how low blood sugar can really affect your emotional state and ability to make decisions rang very true to me.

“The KLF: Chaos, Magic and the Band who Burned a Million Pounds” by John Higgs

A brilliant account of the story of the KLF (so far) but it’s really about far more than just what they’ve done – it goes into the art movements that inspired or (may) relate to the band, and different ways of interpreting their bizarre story. It’s also very funny – I laughed out loud a lot when reading this.

“Stranger Than We Can Imagine” by John Higgs

By the same author as the previous book, this is an audacious journey through the 20th century, looking at how art and culture changed. I enjoyed it, and admire both the attempt and the writing – it’s great fun to read.

Terry Pratchett – Discworld novels

I’ve lost track of which Discworld novels I’ve re-read over this period, but it certainly included the Witches series. Anyway, they’re always so pleasurable to read – I’m really glad I started reading them again after such a long gap.

“Master and Commander” by Patrick O’Brian

I know lots of people who are big fans of Patrick O’Brian’s Aubrey / Maturin novels, but I’d never read any before this. I found it took a little while to get used to the idea that it was fine to not understand all the nautical terms, and just figure out roughly what’s going on from context – this is very similar to lots of science fiction, in fact. (I should say that I know some people get just as much enjoyment out of understanding every detail of these books as well.) Anyway, I’m planning to read more of the series, but got a bit stuck, for reasons I’m not sure of, part way through “Post Captain”. I’m sure I’ll go back to it, though.

“Danny the Champion of the World” by Roald Dahl

I loved this as a child, and it’s still marvellous. Perhaps the biggest shift in getting older is that I thought the father was wonderful when reading it as a child, and as a 40-year-old my thoughts were that he was, despite having many great qualities, unbelievably irresponsible.

“Nation” by Terry Pratchett

A marvellous standalone Terry Pratchett novel, which is funny and delightful. I think probably the less said about it the better, because right from the start there are things that are surprising, so arguably might constitute spoilers :)

“Anno Dracula” by Kim Newman

I don’t normally read horror fiction, so this was a bit of a departure for me. It’s by Kim Newman (whom you might know better from his film criticism) and set in a Victorian London some time after a key event in Dracula went differently, and now about a third of the population are vampires and Dracula is the Prince Consort. It was certainly disturbing, but very imaginative and packed with fictional characters from the period which reminded me of Alan Moore’s excellent League of Extraordinary Gentlemen comics. It’s really good.

“Dracula” by Bram Stoker

I think I re-read this after Anno Dracula. It’s still excellent, of course, but for some reason the thing that jumped out at me this time is that Van Helsing’s treatment (random blood transfusions from one person to another) was very likely to harm the recipient given this was before blood types were known about.

“The Knot Shop Man”: “Air”, “Fire”, “Earth” and “Water”

(Disclaimer: the author is a friend and colleague.) An amazing set of four books that you can read in any order, despite having a single narrative thread running through them all. These are in the finest tradition of dark and fantastic stories for children (sorry: “smart children or thoughtful adults”) with a huge amount to enjoy. (If you’re interested in knots, then you’ll like it even more.)

“The Fourth Revolution” by John Micklethwait and Adrian Wooldridge

I didn’t quite know what to make of this; the history of what they consider the previous revolutions in types of government was interesting to me, as was the cataloguing of the changes the world is going through. The conclusions, though, seemed not only like just what you would expect from editors of The Economist, but also pretty unimaginative.

“Creativity, Inc” by Ed Catmull

I wish this was more about the history of Pixar, and trying less to be a management book. (The management stuff is pretty good, and has quite a few things that were certainly readily applicable to where I work, but it’s not really why I was reading it.) It is a good insight into Pixar as a company, though, if you’re a fan of them (as I am!) both on a cinematic and technical level. There’s quite a bit about the troubled development history of some of their most successful films (e.g. Toy Story 2) and how they decide what to do with projects that are seen as failing.

It describes “Inside Out” as a work in progress, and on reading that I thought, “how’s that ever going to work?” How wrong I was :)

“Jeeves and the Wedding Bells” by Sebastian Faulks

I was a bit sceptical about this – a Jeeves and Wooster novel written by someone other than Wodehouse! – but it’s really joyous, particularly in ways that I can’t talk about for fear of spoiling it for you :)

“Delusions of Gender” by Cordelia Fine

A very good book debunking the nonsense talked about the difference between men’s and women’s brains.

“The Paying Guests” by Sarah Waters

I think Sarah Waters is just a brilliant writer – I’ve read all of her novels, and this latest one is great too.

“Geek Sublime” by Vikram Chandra

Unfortunately I didn’t get on with this at all. I care a lot about the aesthetics of programming, obviously, but didn’t really relate to what the author was saying.

“The Way Inn” by Will Wiles

An excellent second novel from Will Wiles (disclaimer: whom I know a bit from college). It’s funny, painful, and not quite what you might expect from early on in the book. (It’s added a certain something to staying in chain hotels for me, which you’ll understand if you read it.)

“The Blunders of Our Governments” by Anthony King and Ivor Crewe

This book is in two halves – the chapters in the first half are each on a blunder which a UK government has made in recent history (e.g. the Poll Tax, Individual Learning Accounts, etc.) and the second half has more general analysis and suggestions. I think the first half is brilliant, and very professionally relevant to me and a huge proportion of the people I know who work in civic tech or GDS, say. The second half is less convincing, particularly when it touches on IT projects. But the case studies in the first half are compelling, partly because you will frequently cringe on hearing some of the mistakes that well-meaning people have made.

“Dodger” by Terry Pratchett

More Terry Pratchett, in Dickensian mode (Dickens even appears as a character). Very enjoyable, as you’d expect, and I found the mystery / thriller aspect exciting.

“Coraline” by Neil Gaiman

I read this partly out of embarrassment that I hadn’t before, but also because a friend told me they didn’t enjoy “The Ocean at the End of the Lane” because it was so similar to Coraline. I disagree, I think – both stand on their own merits. They’re both brilliant stories – I wish they’d been around for me to read when I was much younger. (I watched the animated film of Coraline later, which is a very nice adaptation.)

“23 Things They Don’t Tell You About Capitalism” by Ha-Joon Chang

This title rather sets up the obvious line that this included quite a lot of things that I have been told about capitalism :) It’s interesting, though, and a quick read with some interesting examples, and tackles lots of broken assumptions people make about economics.

“A Slip of the Keyboard” by Terry Pratchett

This is a collection of Terry Pratchett’s non-fiction writing, which has some real gems in it – e.g. his anger when writing about the danger of extinction of orangutans is very powerful, and (very different in tone) an essay about how our Father Christmas has clearly been swapped with one from another universe is brilliant. I read this shortly after Terry died, so a lot of this felt very poignant.

“The Life-Changing Magic of Tidying” / “Spark Joy” by Marie Kondo

These books, which are ostensibly about tidying up, but in large part about getting rid of your possessions, have a huge following. I’m a bit conflicted about these because they are bonkers, but they’ve been genuinely useful to me in prompting me to get on with (a) recycling or selling on things that I don’t really need, based on her test of “does it spark joy when you hold it?” and (b) her techniques for folding and storing clothes.

(When I say it’s “bonkers”, I mean, for example, the suggestion that you give little pep talks to screwdrivers and other unglamorous but necessary possessions; thinking about how your socks feel being bundled into a drawer; the assertion that clothes that are worn closer to your heart are easier to feel affinity with, etc. etc. Still, it’s interesting to me that despite these things, I think these books have still had a very positive influence on my life.)

“Hatchet Job” by Mark Kermode

As a dedicated follower of the church of Wittertainment, I was predisposed to like this book by Mark Kermode about the practice of film criticism and its place in the world, and it didn’t disappoint me.

Christmas Gift Ideas

Holiday gift guides seem to be almost exclusively terrible, particularly those aimed at “geeks”, whatever people think that means. In particular these lists are often split by gender, presumably because they’re written by (or pandering to) idiots. And they’re typically full of novelties or cute ideas that in practice will just occupy valuable space and you’d feel guilty about giving away or recycling. This post is my attempt to write a list of ideas which I think has (mostly) genuinely useful presents on it, based on things we’ve owned and used for a while.

I’m quite conflicted about this exercise, I should probably say. Different families and social groups have very different present-giving cultures, but for many people in a similarly lucky position to me, getting consumer items that you don’t really want, or a subtly wrong variant of something you do want, is worse than getting nothing at all, and much worse than the person giving money to a charity instead. That said, maybe these lists are useful as a basis for things people might suggest and discuss before giving as presents?

(There are quite a few Amazon affiliate links here, which I haven’t tried using in a blog post before. I imagine no one much will read or click on links in this post anyway, but if that bothers you, you’ve been warned, at least.)

Muji touchscreen gloves

You can get gloves with conductive material in the fingertips from loads of places nowadays, in fact. The idea is that you can use capacitive touch-screens, like those on your phone or tablet, without removing your gloves. These Muji ones are pretty cheap, and work OK – I find you need a bit more pressure than without the gloves to get them to work, but it’s fine.

Non-contact infrared digital thermometer

These thermometers are brilliant for accurate and remote measurement of temperature. (This was a great recommendation from my colleague Paul.) I use mine quite a lot for things like cooking and checking the oven temperature, as well as measuring the surface temperature of my feet, how cold the walls are, etc. It has a LASER pointer as well to mark what you’re measuring the temperature of. I think this is very similar to the device used by Gale Boetticher in Breaking Bad when he’s making tea, if that means anything to you :)

Ear plugs for loud concerts and public transport

I’m sure that my hearing was somewhat damaged from gigs and nightclubs when I was younger; it always seemed to be particularly bad after going to small venues where the treble is far too powerful. (I wonder if this is because the sound engineers have damaged the higher ranges of their hearing over the course of their working lives and then compensate for that, damaging the customers’ hearing in the same way, and so on…)

To avoid further damage to my hearing nowadays, I always take ear plugs with a fairly flat frequency response along to gigs. They can never be ideal: you often get a huge amount of bass sound through bone conduction at loud gigs, and the ear plugs can’t do anything about that. It’s probably better to have rather too bassy sound than damage your hearing, though.

The ones I’ve linked to from the heading have switchable filters for different levels of sound attenuation – the ones I have don’t have that feature, but in retrospect it would have been nice if they did.

I carry these in my bag all the time, and they’re also frequently useful for blocking out sound on public transport, to give yourself some peace and quiet.

Raspberry Pi 3 Model B

I think it’s always possible to think of a fun new project that a Raspberry Pi would be useful for, and the Pi 3 is a big step forward from the previous models, having Wi-Fi and Bluetooth built-in.

I really like the lovely Pimoroni cases, like these, which I got for our Pi 3, and they do other nice accessories.

A good battery charger

NiMH rechargeable batteries are really good nowadays, and save you quite a bit of money if you have lots of devices that use AA / AAA batteries. We got this battery charger in part because I believe it is the same model, albeit with different branding, as the Wirecutter’s “Also Great” pick – it has more features than their basic suggestion. (There’s a useful FAQ for it in an Amazon review of the US version.) Although the UI isn’t very intuitive, you can use the device to calculate the capacity of your existing batteries, which is really helpful – when we first got it I went through all of our existing rechargeable batteries to work out which were worth keeping and which should be replaced.

“The Knot Shop Man” books

My friend and colleague Dave Whiteland wrote an amazing series of books called “The Knot Shop Man”, which are described as being for “smart children or thoughtful adults”. The theme of knots runs through all the books (which you can read in any order) and they come bound up in a very special knot. (After finishing reading them, you should try to retie it. :)) Not enough people know about these books, and I think the ones he’s selling at the moment are limited in quantity.

A Network Attached Storage device

If you don’t have any Network Attached Storage, I think you’ll be surprised at how soon you come to rely on it, both for backups and for storing music and videos on to stream to your phone, TV, or whatever.

For me, one of the Synology 2 drive bay boxes seemed about right, and it’s been brilliant so far. (Another of my excellent colleagues, Zarino, wrote a blog post about the initial setup of these.)

An AeroPress coffee maker

This is our favourite way of making coffee, for one or possibly two people.

A cut-resistant glove

If you have one of those excellent Microplane graters or a mandoline and are as clumsy as me, you’re probably familiar with the experience of accidentally slicing your hands when using them. You can partially solve this problem with a cut-resistant glove. (I say “partially” because this then shifts the problem to “remembering to use the cut-resistant glove”.)

A Mu folding-plug charger

These are a lovely design, which makes a UK 3-pin plug take up less space by letting you fold it away, and provides a 2.4A USB port.

Butchery course at the Ginger Pig

Presents that are experiences rather than things-that-take-up-physical-space often work out well. An interesting one of these we did is a course in butchery at the Ginger Pig – you can do a class on pork, beef or lamb and you get both a big meal and a lot of meat to take away, as well as learning about what each cut of meat is good for.

Bluetooth keyboard

I really try to avoid using Bluetooth, because, well, it’s terrible, and gives me flashbacks to the worst job I ever had, working on “personal area networking”. But this keyboard has actually been pretty good – you can have it paired to three different devices and swap between them easily. (It doesn’t seem to be easy to find a keyboard that can be paired with more than three devices, but maybe someone knows of one.)

Bose noise-cancelling headphones

(I have an earlier version of these, but the QC25 seems to be the current equivalent.) I gather that headphone connoisseurs don’t particularly rate the sound quality of these, but basically everyone agrees that the noise cancelling is amazing. (The sound seems great to me, for what it’s worth, but I’m not an audiophile.) For long coach, train and plane trips they’re fabulous, if quite bulky to take with you. They are expensive, though.

Sliding webcam covers

This is a set of 5 little sliding webcam covers. The idea is that if someone malicious gains remote access to your computer, then the impact if they can also see everything from your webcam is much worse than if they can’t. These little windows are really cheap and mean that you can just open the webcam window when you actually need it.

A bumper case and screen protector for your phone

I think I probably drop my phone about once a day, but haven’t smashed its screen yet, thanks to having a good bumper case and screen protector. The Ringke Fusion line of bumper cases (which they seem to do for most current phones) are the ones I’ve used for a while, and as a very clumsy person I can testify that they protect your smartphone very well.

As a screen protector I’m currently using one of the “Invisible Defender Glass” models.

A strangely unpopular feature of AV receivers: HDMI-CEC remote control pass-through

We had a couple of power cuts in quick succession recently, and the fluctuations in power around that time killed our old AV receiver. ¹ (If you’re not sure what such a device is for, I’ve described how we use it in footnote 2.) In shopping for a replacement, I thought that to a reasonable approximation I only needed to look at one feature:

  • How many HDMI inputs does it have?

… since having at least 5 is useful, and plenty of the cheaper devices on the market only have inputs for 3 or 4 HDMI devices.

But this turned out to be naïve. I’d just assumed that all AV receivers would have the second most important feature I actually want, which is:

  • HDMI-CEC remote control pass-through. This means that when you press the arrow keys, play, pause, etc. on the AV receiver’s remote control, those commands are sent to the HDMI input device that’s currently active.

Why is that useful? In our case, because two of the connected devices are a Raspberry Pi (running Kodi) and a Playstation 4, both of which can be controlled this way just from the AV receiver’s remote control. ³ Without this feature we’d need to get an IR receiver for the Raspberry Pi or have a Bluetooth keyboard by the sofa, and use the PS4’s Dualshock 4 controller, which isn’t great as a remote control – the shoulder buttons tend to get pushed when you put it down. Even if we got those alternative controllers, it’d mean two additional remote controls to juggle.

However, I’ve discovered the expensive way that this fantastically useful feature is:

  • Very variably supported – even devices that claim to support HDMI-CEC don’t necessarily do the remote control pass-through.
  • Not widely known about – the two otherwise well-informed people I spoke to at Richer Sounds hadn’t even heard of this feature.
  • Not typically listed in manufacturers’ specifications, so it’s very hard to tell if a device you’re going to buy will actually support it.

I don’t understand

Clearly I’m missing something about the world of home cinema / Hi-Fi, otherwise more people would know and care about this feature.

The number of different remote controls you need is one of the most obvious ways that usability of home cinema systems tends to be awful. Is it the case that most of the people in the market for AV receivers don’t care about this? Do they live with people who aren’t Hi-Fi enthusiasts, and how do those people deal with it?

Theory 1 – they live with the complexity

Perhaps they just live with the complexity: they keep four or five remote controls around all the time, and effectively write an operations manual for their families or house guests who might need to try something as extraordinary as, I dunno, watching TV, or using Netflix.

Theory 2 – typically people have few input devices

Maybe people are OK with two or three remote controls, and this usability problem only becomes really bad when you end up needing four or five. (I don’t buy this, really – even with two or three remote controls the situation’s pretty bad.)

Theory 3 – more people use all-in-one remotes than I expect

Of course, you can buy all-in-one remote controls that can be taught the IR signals sent by any of your existing remote controls so that they can be sent from one device. Maybe these are much more popular than I imagine? I can’t think of anyone I know who has one, but obviously that’s not a very representative sample.

The problems with an all-in-one IR remote control for us would be that (a) the PS4 controller uses Bluetooth, so it wouldn’t work for that, and (b) we’d still need to get an IR receiver + remote for the Raspberry Pi.

In any case, these devices seem so inelegant compared to the HDMI-CEC solution – why should the remote control have to preserve the state indicating which device’s IR codes should be sent? With the CEC approach, the AV receiver knows which source is active, and so which device it should send the command to.
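To make that contrast concrete, here is a minimal sketch in Python of receiver-side routing. The `Receiver` class and its method names are invented for illustration, and the byte values are assumptions based on my reading of the CEC specification (logical address 5 for an audio system, 4 and 8 for playback devices, opcode 0x44 for “User Control Pressed” and 0x45 for “User Control Released”). It’s a toy model, not a description of how any particular receiver implements this.

```python
# Toy model of receiver-side remote control pass-through (hypothetical names).
# Assumed CEC values: logical address 5 = audio system (the AV receiver),
# 4 and 8 = playback devices; opcode 0x44 = User Control Pressed,
# 0x45 = User Control Released.

AUDIO_SYSTEM = 0x5
USER_CONTROL_PRESSED = 0x44
USER_CONTROL_RELEASED = 0x45

# A few UI command codes, sent as the operand of User Control Pressed.
UI_COMMANDS = {"select": 0x00, "up": 0x01, "down": 0x02,
               "left": 0x03, "right": 0x04, "play": 0x44, "pause": 0x46}


class Receiver:
    """The receiver, not the remote, remembers which source is active."""

    def __init__(self, sources):
        self.sources = sources  # input name -> CEC logical address
        self.active = None

    def select_input(self, name):
        self.active = name

    def key_press(self, key):
        """Return the CEC frames (press, then release) for the active source."""
        if self.active is None:
            return []
        dest = self.sources[self.active]
        # Header byte: initiator in the high nibble, destination in the low.
        header = (AUDIO_SYSTEM << 4) | dest
        return [bytes([header, USER_CONTROL_PRESSED, UI_COMMANDS[key]]),
                bytes([header, USER_CONTROL_RELEASED])]


receiver = Receiver({"raspberry_pi": 0x4, "ps4": 0x8})
receiver.select_input("raspberry_pi")
frames = receiver.key_press("play")
print([frame.hex(":") for frame in frames])
```

The remote control itself holds no state here: switching inputs on the receiver changes the destination of every later key press, which is exactly what an all-in-one IR remote has to track for itself.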

30 Rock - "St Valentine's Day" mention of universal remote controls
Included only because if there’s even a tangential reference in 30 Rock to what you’re talking about, you should include it…

Theory 4 – there’s some other solution I don’t know about

This seems quite likely, and if it’s correct, I’d really like to know what the answer is! Maybe everyone with several media players connected to their AV receiver these days is controlling them with terrible iPad apps, or telekinesis or something.

Not very clear conclusions

My guess is that usability by non-experts has never been a big concern to manufacturers of relatively expensive AV equipment; usability isn’t their central concern in the way that it is for Amazon or Zipcar, say. Their market seems to care a lot about things that matter little to me (e.g. huge numbers of surround speakers, differences in audio quality that are difficult to detect, etc.) and maybe they think that their target market just doesn’t care about the inconvenience of needing five or so remote controls in reach.

For music alone, I suppose people who care about usability (but not necessarily open standards) are well served by iPhone docks with active speakers, or systems like Sonos. I’m not sure if there are corresponding solutions that are easier for people to use if you have multiple media sources that include video too.

For the moment, I’m a bit torn between:

  • Returning the current replacement receiver I bought (a Denon model that advertises HDMI-CEC support, but doesn’t send the remote control commands, grrr), which is a hassle, and probably involves explaining this blog post to people at Richer Sounds, which I don’t relish. Then trying to find a replacement that does support this feature.
  • Living with the additional complexity. Something that would mitigate that problem a bit would be getting the PS4 universal remote control, which is a bit like an all-in-one remote that also speaks the PS4’s bluetooth controller protocol. We’d need to get an IR receiver / remote control for the Raspberry Pi, too, of course.

Anyway, this all seems very unsatisfactory to me, but maybe I’m missing something really obvious.


¹  One lesson from this experience, which in retrospect should have been obvious, is that if you have a surge-protected extension lead with a “SURGE PROTECTED WHEN LIT” indicator light, it’s not doing any surge protection if that light’s off — it being off typically means it’s protected you from some surge in the past and isn’t providing protection any more; some kind of fuse in it has blown.

²  For me, the AV receiver takes the place that my old Hi-Fi amplifier used to occupy: it’s an amplifier that the good speakers are plugged into. However, the AV receiver also sends its video output to the TV, and all its inputs are HDMI, which can carry audio and / or video, as opposed to just audio. So the AV receiver is the only device plugged into our TV, and the input devices that send audio and video to the receiver are:

  • A PlayStation 4, for games, Netflix, Amazon Prime Video, DVDs and Blu-Rays.
  • A Raspberry Pi running Kodi, for playing audio files and films and TV that I’ve ripped.
  • A Chromecast, which we stream music and YouTube video to.
  • A TiVo, for random broadcast TV
  • An occasionally connected laptop, via an HDMI to MiniDisplayPort cable, for LuckyVoice at home :)

So basically it’s a way of having audio and video on the best output devices in the flat, no matter what the input source is.

³  I’m making it sound like the remote control pass-through worked perfectly with our old AV receiver; that’s not quite the case. The one big annoyance was that there was no way to do the equivalent of pressing the “Playstation” button on the PS4’s DualShock 4 controller, which you need to turn it on and select a user. Otherwise it did basically send all the commands we needed for watching streaming video, DVDs and Blu-Rays on the PS4, and everything we needed to operate Kodi from day-to-day.

⁴  I gather from hearsay that HDMI-CEC support in general is a bit of a mess: whether it’s because the specification isn’t strict enough or manufacturers are just implementing it very differently isn’t clear to me. Maybe someone can summarize that? (We’ve certainly had problems in the past with the power state change signals causing devices to power on again after you’ve just shut another one down, for example.) Still, the remote control pass-through worked well for us.

⁵  I asked Denon whether I was just missing a menu option to turn on support for HDMI-CEC remote control pass-through, and it seems that I wasn’t – here’s how that correspondence went:

I recently bought a Denon AVR-X2300W AV receiver, which claims in the specification on your website to support HDMI-CEC.  However, the feature of HDMI-CEC that I really need doesn’t seem to be working: remote control pass-through.

To be clear, this means that when I use arrow buttons, play button, etc. on the AVR-X2300W’s remote control, those commands should be sent to the current HDMI source device. This doesn’t work, although I have Setup > Video > HDMI Setup > HDMI Control set to “On”. I can’t see any other option in that menu that might turn on that particular feature of HDMI-CEC.

This is a really crucial feature for me, and it worked fine on the (much cheaper!) Yamaha AV receiver I had previously. Is there some other option I need to select to enable HDMI-CEC remote control pass-through or any way to get this to work?

I’d appreciate any information you can give me about this – if this feature is not supported I may have to return it to the shop :(

Many thanks,

And I got this reply:

Thank you for your inquiry.

CEC protocol is a standard but its mandatory definitions do not include many features and functionality. These are considered as extended features and may or may not be implemented by different manufacturers. You maybe simply experiencing the difference in different manufacturers implementation of CEC. As a consumer the only potential solution for this would be ensuring that all equipment is running on the latest version of firmware.  ( relevant settings should also be made on the equipment ). As extended features are not guaranteed you may need to use  alternative methods for control.

Apologies for any inconvenience.

DENON Customer Support

Migrating YourNextRepresentative from PopIt to django-popolo

This post was originally intended for the mySociety blog, but because of its length and technical content people suggested to me it might be more suitable for my own blog. This is about the development of the YourNextRepresentative project (GitHub link), and in particular why and how we migrated its storage system from using the PopIt web service to instead use PostgreSQL via django-popolo. Hopefully this might be of interest to those who have been following the development of YourNextRepresentative (previously YourNextMP).


YourNextMP for 2015

For the 2015 General Election in the UK, we developed a new codebase to crowd-source the candidates who were standing in every constituency; this site was YourNextMP. (Edmund von der Burg kindly let us use the domain name, which he’d used for his site with the same intent for the 2010 election.) We were developing this software to help out Democracy Club, who ran the site and built up the community of enthusiastic contributors that made YourNextMP so successful.

At the time, we saw this crowd-sourcing challenge as a natural fit for another technology we had been developing called PopIt, which was a data store and HTTP-based API for information about people and the organizations they hold positions in. PopIt used the Popolo data model, which is a very carefully thought-out specification for storing open government data. The Popolo standard helps you avoid common mistakes in modelling data about people and organisations, and PopIt provided interfaces for helping people to create structured data that conformed to that data model, and also made it easily available.
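As a rough illustration of the shape of that data, here is a minimal Python sketch of a person, an organisation, and the membership that links them. The field names (`id`, `name`, `classification`, `person_id`, `organization_id`, `role`) are assumed from the Popolo specification, the example values are invented, and the helper function is hypothetical rather than anything PopIt provided.

```python
import json

# Invented example data; field names assumed from the Popolo specification.
person = {"id": "person-1", "name": "Jane Example"}
organization = {"id": "party-1", "name": "Example Party",
                "classification": "party"}
# A membership records that a person holds a role in an organization.
membership = {"person_id": person["id"],
              "organization_id": organization["id"],
              "role": "Candidate"}


def organizations_for(person_id, memberships, organizations):
    """Return the names of the organizations a person has memberships in."""
    by_id = {org["id"]: org for org in organizations}
    return [by_id[m["organization_id"]]["name"]
            for m in memberships if m["person_id"] == person_id]


print(json.dumps(membership, indent=2))
print(organizations_for("person-1", [membership], [organization]))
```

Keeping the link between a person and an organisation in its own object, rather than nesting one inside the other, is what lets the same person hold several positions over time.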

The original intention was that YourNextMP would be quite a thin front-end for PopIt but, as development progressed, it became clear that this would provide a very poor user experience for editors. So, the YourNextMP wrapper, even in the very early versions, had to provide a lot of features that weren’t available in PopIt, such as:

  • A user experience specific to the task of crowd-sourcing election candidate data, rather than the very generic and unconstrained editing interface of PopIt’s web-based UI.
  • Versioning of information about people, including being able to revert to earlier versions.
  • Lookup of candidates by postcode.
  • Summary statistics of progress and completion of the crowdsourcing effort.
  • Logging of actions taken by users, so recent changes could be tracked and checked.
  • CSV export of the data as well as the JSON based API.
  • etc. etc.

Later on in development we also added more advanced features such as a photo moderation queue. To help support all of this, the YourNextMP front-end used a traditional RDBMS (in our case usually PostgreSQL), but effectively PopIt was the primary data store, which had all the information about people and the posts they were standing for.

This system worked fine for the general election, and I think everyone considered the YourNextMP project to be a great success – we had over a million unique users, and the data was extensively reused, including by Google for their special election information widget. (You can see some more about the project in this presentation.)

Turning YourNextMP into YourNextRepresentative

There had been a considerable demand to reuse the code from YourNextMP for other elections internationally, so our development efforts then focussed on making the code reusable for other elections. The key parts of this were:

  • Separating out any UK-specific code from the core application
  • Supporting multiple elections generically (the site was essentially hard-coded to only know about two elections – the 2010 and 2015 UK general elections)
  • Internationalizing all text in the codebase, so that it could be localized into other languages.

We worked on this with the target of supporting a similar candidate crowd-sourcing effort for the general election in Argentina in 2015, which was being run by Congreso Interactivo, Open Knowledge Argentina and the Fundación Legislativo. This new version of the code was deployed as part of their Yo Quiero Saber site.

Since the name “YourNextMP” implies that the project is specific to electoral systems with Members of Parliament, we also changed the name to YourNextRepresentative. This name change wasn’t just about international re-use of the code – it’s going to be supporting the 2016 elections in the UK as well, where the candidates won’t be aspiring to be MPs, but rather MSPs, MLAs, AMs, mayors, PCCs and local councillors.

Problems with PopIt

It had become apparent to us throughout this development history that PopIt was creating more problems than it was solving for us. In particular:

  • The lack of foreign key constraints between objects in PopIt, and across the interface between data in PostgreSQL and PopIt, meant that we were continually having to deal with data integrity problems.
  • PopIt was based on MongoDB, and while it used JSON schemas to constrain the data that could be stored in the Popolo-defined fields, all other data you added was unconstrained. We spent a lot of time writing scripts that just fixed data in PopIt that had been accidentally introduced or broken.
  • PopIt made it difficult to efficiently query for a number of things that would have been simple to do in SQL. (For example, counts of candidates per seat and per party were calculated offline from a cron-job and stored in the database.)
  • Developers who wanted to work on the codebase would have to set up a PopIt locally or use our hosted version; this unusual step made it much more awkward for other people to set up a development system compared to any conventional Django-based site.
  • PopIt’s API had a confusing split between data that was available from its collection-based endpoints and the search endpoints; the latter meant using Elasticsearch’s delightfully named “Simple Query String Query” which was powerful but difficult for people to use.
  • It was possible (and common) to POST data to PopIt that it could store in MongoDB but couldn’t be stored in Elasticsearch, but no error would be returned. This long-standing bug meant that the results you got from the collections API (MongoDB-backed) and search API (Elasticsearch-backed) were confusingly inconsistent.
  • The latency of requests to the API under high load meant we had to have a lot of caching. (We discovered this on the first leaders’ debate, which was a good indication of how much traffic we’d have to cope with.) Getting the cache invalidation correct was tricky, in accordance with the usual aphorism.
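As a rough illustration of the point about queries that are simple in SQL, here is a minimal sketch of the per-seat and per-party counts as plain GROUP BY queries. The `candidacy` table, its columns and the sample rows are invented for the example; they are not the real YourNextRepresentative schema.

```python
import sqlite3

# Hypothetical, simplified schema: one row per candidacy.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE candidacy (
        person TEXT NOT NULL,
        seat   TEXT NOT NULL,
        party  TEXT NOT NULL
    );
    INSERT INTO candidacy VALUES
        ('Alice', 'Seat A', 'Party X'),
        ('Bob',   'Seat A', 'Party Y'),
        ('Carol', 'Seat B', 'Party X');
""")


def counts_by(column):
    """Return {value: number of candidacies}, grouped by 'seat' or 'party'."""
    # Column names can't be bound as parameters, so whitelist them instead.
    if column not in ("seat", "party"):
        raise ValueError("unsupported column: %r" % column)
    query = "SELECT {0}, COUNT(*) FROM candidacy GROUP BY {0}".format(column)
    return dict(conn.execute(query).fetchall())


per_seat = counts_by("seat")
per_party = counts_by("party")
```

With the data in a relational database, counts like these can be computed at request time rather than precomputed by a cron job.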

At the same time, we were coming to the broader conclusion that the PopIt project hadn’t been achieving the goals that we’d hoped it would, despite having put a lot of development time into it, and we decided to stop all development on PopIt. (For many applications of PopIt we now felt the user need was better served by the new EveryPolitician project, but in YourNextRepresentative’s case that didn’t apply, since EveryPolitician only tracks politicians after they’re elected, not when they’re candidates.)

As developers we wanted to be able to use a traditional RDBMS again (through the Django ORM) while still getting the benefits of the carefully thought-out Popolo data model. And, happily, there was a project that would help us to do exactly that – the django-popolo package developed by openpolis.

Migrating YourNextRepresentative to django-popolo

django-popolo provides Django models which correspond to the Popolo Person, Organization, Membership, Post and Area classes (and their associated models like names and contact details).

We had used django-popolo previously for SayIt, and have been maintaining a fork of the project which is very close to the upstream codebase, except for removing the requirement that one uses django-autoslug for managing the id field of its models. We opted to use the mySociety fork for related reasons:

  • Using the standard Django integer primary key id field rather than a character-based id field seems to be closer to Django’s “golden path”.
  • There are interesting possibilities for using SayIt in a YourNextRepresentative site (e.g. to capture campaign speeches, or promises) and using the same version of django-popolo will make that much easier

Extending django-popolo’s models

Perhaps the biggest technical decision about how to use the django-popolo models (the main ones being Person, Organization, Post and Membership) is how we extended those models to add the various extra fields we needed for YourNextRepresentative’s use case. (With PopIt you could simply post a JSON object with extra attributes.) The kinds of extra-Popolo data we recorded were:

  • The one or many elections that each Post is associated with.
  • The particular election that a Membership representing a candidacy is associated with.
  • The set of parties that people might stand for in a particular Post. (e.g. there’s a different set of available parties in Great Britain and Northern Ireland).
  • The ‘versions’ attribute of a Person, which records all the previous versions of that person’s data. (We considered switching to a versioning system that’s integrated with Django’s ORM, like one of these, but instead we decided to just make the smallest incremental step as part of this migration, which meant keeping the versions array and the same JSON serialization that was used previously, and saving the switch of versioning system for the future.)
  • Multiple images for each Person and Organization. (django-popolo just has a single ‘image’ URL field.)
  • Whether the candidates for a particular Post are locked or not. (We introduced a feature where you could lock a post once we knew all the candidates were correct.)
  • To support proportional representation systems where candidates are elected from a party list, each Membership representing a candidacy needs a “party_list_position” attribute to indicate that candidate’s position on the party list.
  • etc.

Perhaps the most natural way of adding this data would be through multi-table inheritance; indeed, that is how SayIt uses django-popolo. However, we were wary of this approach because of the warnings in Two Scoops of Django and elsewhere that using multi-table inheritance can land you with difficult performance problems because queries on the parent model will use OUTER JOINs whether you need them or not. We decided instead to follow the Two Scoops of Django suggestion and make the one-to-one relationship between parent and child table explicit by creating new models called PersonExtra, PostExtra, etc. with a `base` attribute which is a OneToOneField to Person, Post, etc., respectively. This means that the code that uses these models is slightly less clear than it would be otherwise (since sometimes you use person, sometimes person.extra) but we do have control over when joins between these tables are done by the ORM.
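To make the trade-off a bit more concrete, here is a minimal sketch of what an explicit one-to-one “Extra” table amounts to at the SQL level, using sqlite3 rather than Django so that it runs on its own. The table and column names (`person`, `person_extra`, `base_id`, `versions`) are assumptions made for the example and aren’t the project’s real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE person (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    -- The UNIQUE foreign key is what makes the relationship one-to-one.
    CREATE TABLE person_extra (
        id       INTEGER PRIMARY KEY,
        base_id  INTEGER NOT NULL UNIQUE REFERENCES person(id),
        versions TEXT NOT NULL DEFAULT '[]'
    );
    INSERT INTO person (id, name) VALUES (1, 'Example Person');
    INSERT INTO person_extra (base_id, versions)
        VALUES (1, '[{"name": "Example Person"}]');
""")

# Listing names only touches the base table: no join is involved.
names = [row[0] for row in conn.execute("SELECT name FROM person")]

# The join is written out only where the extra data is actually wanted.
joined = conn.execute(
    """
    SELECT person.name, person_extra.versions
    FROM person
    JOIN person_extra ON person_extra.base_id = person.id
    WHERE person.id = ?
    """,
    (1,),
).fetchone()

# A second extra row for the same person violates the UNIQUE constraint.
try:
    conn.execute("INSERT INTO person_extra (base_id) VALUES (1)")
    duplicate_allowed = True
except sqlite3.IntegrityError:
    duplicate_allowed = False
```

In Django the same decision is made per query, for example by adding `select_related` only to the querysets that need the extra fields.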

Data migration

Once the Extra models were created, we wrote the main data migration. The idea of this was that if your installation of YourNextRepresentative was configured as one of the known existing installations at the time (i.e. the ELECTION_APP setting specified the St Paul, Burkina Faso, Argentina or the UK site) this data migration would download a JSON export of the data from the corresponding PopIt instance and load it into the django-popolo models and the *Extra models that extend them.

As the basis for this migration, we contributed a PopIt importer class and management command upstream to django-popolo. This should make it easier for any project that used to use PopIt to migrate to django-popolo, if it makes sense for them to do so. Then the data migration in YourNextRepresentative subclassed the django-popolo PopItImporter class, extending it to also handle the extra-Popolo data we needed to import.
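The general shape of that subclassing can be sketched as below. This is only an illustration of the pattern: `BaseImporter`, `ExtraAwareImporter`, their methods and the layout of the export dictionary are all hypothetical stand-ins, and not the real API of django-popolo’s PopItImporter.

```python
import json


class BaseImporter:
    """Stand-in for an upstream importer; NOT django-popolo's real API."""

    def __init__(self):
        self.people = {}

    def import_export(self, export):
        # Walk the export and hand each person to an overridable hook.
        for person_data in export.get("persons", []):
            self.update_person(person_data)

    def update_person(self, person_data):
        # Only the standard fields are handled at this level.
        person = {"id": person_data["id"], "name": person_data["name"]}
        self.people[person["id"]] = person
        return person


class ExtraAwareImporter(BaseImporter):
    """Project-specific subclass that also keeps the non-standard data."""

    def __init__(self):
        super().__init__()
        self.extras = {}

    def update_person(self, person_data):
        # Let the base class do the standard import, then add the extras.
        person = super().update_person(person_data)
        self.extras[person["id"]] = {
            "versions": json.dumps(person_data.get("versions", [])),
        }
        return person


# Hypothetical export data, just to exercise the two classes.
export = {
    "persons": [
        {"id": "1", "name": "Example Person", "versions": [{"name": "E. Person"}]},
    ]
}
importer = ExtraAwareImporter()
importer.import_export(export)
```

The appeal of this arrangement is that the generic import logic lives in one place upstream, and each project only overrides the hooks where it has extra data to store.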

(A perhaps surprising ramification of this approach is that once an installation has been migrated to django-popolo we should change the name of that country’s ELECTION_APP, or otherwise someone setting up a new site with that ELECTION_APP configured will have to wait for a long time for out-of-date data to be imported on running the initial migrations. So we will shortly be renaming the “uk_general_election_2015” application to just “uk”. To support people who want that feature (cloning an existing site to a development instance) we’ve added a new “candidates_import_from_live_site” management command that uses its new API to mirror the current version of an existing instance.)

Another issue that came up in working on this data migration is that we needed to preserve any identifiers that were used in URLs on the site previously so that after upgrading each site every bookmarked or search-engine-indexed URL would still show the same content. In the case of the Person model, this was easy because it used an integer ID previously. Parties and posts, however, used strings as their IDs. We migrated these IDs to fields called ‘slug’ (perhaps a slightly misleading name) on the OrganisationExtra and PostExtra models.

This turns out to be quite a slow migration to run – as well as importing the core data, it also downloads every image of people and parties, which is pretty slow even on a fast connection.

Updating views, tests and management commands

The next part of the migration was updating all the code that previously used PopIt’s API to instead use the new Django models. This was a significant amount of work, which left very little code in the project unchanged by the end. In general we tried to update a test at a time and then change the core code such that the test passed, but we knew there was quite a bit of code that wasn’t exercised by the tests. (One nice side-effect of this work is that we greatly improved the coverage of the test suite.)

We did think about whether we could avoid doing this update of the code essentially in one go – it felt rather like a “stop-the-world refactoring”; was there an incremental approach that would have worked better? Maybe so, but we didn’t come up with one that might reasonably have saved time. If the old code that interacted with the PopIt API had been better encapsulated, perhaps it would have made sense to use proxy models which in the migration period updated both PopIt’s API and the database, but this seemed like it would be at least as much work, and it was lucky that we did have a period of time we could set aside for this work during which the sites weren’t being actively updated.

Moving other code and configuration to the database

We also took the opportunity of this migration to introduce some new Django models for data that had previously been defined in code or as Django settings. In particular:

  • We introduced Election and AreaType models (previously the elections that a site supported and the types of geographical boundary they were associated with were defined in a dictionary in the per-country settings).
  • We introduced a PartySet model – this is to support the very common requirement that different sets of parties can field candidates in different areas of the country.
  • We replaced the concept of a “post group” (defined in code previously) with a “group” attribute on PostExtra.

These all had the effect of simplifying the setup of a new site – more of the initial setup can now be done in the Django admin interface, rather than needing to be done by someone who’s happy to edit source code.

Replacing PopIt’s API

One of the nice aspects of using PopIt as the data store for the site was that it supplied a RESTful API for the core data of the site, so we previously hadn’t had to put much thought into the API other than adding some basic documentation of what was there already. However, after the migration away from PopIt we still wanted to provide some HTTP-based API for users of the site. We chose to use Django REST framework to provide this; it seems to be the most widely recommended tool at the moment for providing a RESTful API for a Django site (and lots of people and talks at djangocon EU in Cardiff had independently recommended it too). Their recommendations certainly weren’t misplaced – it was remarkably quick to add the basic read-only API to YourNextRepresentative. We’re missing the more sophisticated search API and various other features that the old PopIt API provided, but there’s already enough information available via the API to completely mirror an existing site, and Django REST framework provides a nice framework for extending the API in response to developers’ needs.

The end result

As is probably apparent from the above, this migration was a lot of work, but we’re seeing the benefits every day:

  • Working on the codebase has become a vastly more pleasant experience; I find I’m looking forward to working on it much more than I ever did previously.
  • We’ve already seen signs that other developers appreciate that it’s much easier to set up than previously.
  • Although the tests are still far from perfect, they’re much easier to work with than previously. (Previously we mocked out a large number of the external PopIt API requests and returned JSON from the repository instead; this would have been a lot better if we’d used a library like betamax instead to record and update these API responses, but not having to worry about this at all and just creating test data with factory_boy is better yet, I think.)

I think it’s also worth adding a note that if you’re starting a new site that deals with data about people and their memberships of organizations, then using the Popolo standard (and if you’re a Django developer, django-popolo in particular) can save you time – it’s easy to make mistakes in data modelling in this domain (e.g. even just related to names). It should also help with interoperability with other projects that use the same standard (although it’s a bit more complicated than that – this post is long enough already, though :))

The UK instance of YourNextRepresentative has been using the new codebase for some time now, and that will be relaunched shortly to collect data on the 2016 elections in the UK.

If you’re interested in starting up a candidate crowd-sourcing site using YourNextRepresentative, please get in touch with our international team or email for any other issues.

Printing out GitHub issues for triage or estimation

One of the projects I’ve been working on at mySociety had a large number of open issues, most of which hadn’t been looked at for some time, and didn’t have estimates of difficulty. To address this we tried a variation of an exercise that’s suggested in “The Scrum Field Guide” in the chapter called “Prioritizing and Estimating Large Backlogs”, which involves printing out all the stories in the product backlog onto index cards and arranging them on a wall roughly in order of difficulty from left to right. (A second stage is to let the product owner adjust the height of each story on the wall to reflect its priority.) It seemed as if this might be a helpful way of estimating our large number of open issues, even though they’re not written as user stories.

An obvious problem here was that we didn’t have an easy way of printing GitHub tickets en masse, so I wrote a script that would generate a PDF for each open issue (excluding those that are pull requests) in a repository’s issue tracker using Pandoc. That script is here in case it’s of use to anyone:

As it turned out, this session was pretty successful – as well as the value of generating lots of difficulty estimates, it was a good way of getting a picture of the history of the project, and starting lots of discussions of issues that only one member of the team knew about.
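For anyone wanting to try something similar, the approach can be sketched roughly as follows. This is not the script mentioned above, just a minimal illustration under a few assumptions: the issue dictionaries have already been fetched from GitHub’s issues API (which marks pull requests with a `pull_request` key), and pandoc plus a PDF engine are installed. The function names and sample data are made up for the example.

```python
import subprocess


def is_pull_request(issue):
    # GitHub's issues API also returns pull requests, marked by this key.
    return "pull_request" in issue


def issue_to_markdown(issue):
    """Render one issue dict as a small Markdown document."""
    body = issue.get("body") or "(no description)"
    return "# #{number}: {title}\n\n{body}\n".format(
        number=issue["number"], title=issue["title"], body=body
    )


def write_pdf(issue, directory="."):
    """Pipe the Markdown to pandoc on stdin and return the PDF's path."""
    path = "{0}/issue-{1}.pdf".format(directory, issue["number"])
    subprocess.run(
        ["pandoc", "-f", "markdown", "-o", path],
        input=issue_to_markdown(issue),
        text=True,
        check=True,
    )
    return path


# Made-up sample data, shaped like the API's response.
issues = [
    {"number": 12, "title": "Fix login", "body": "Steps to reproduce..."},
    {"number": 13, "title": "Add feature", "body": None, "pull_request": {}},
]
printable = [issue for issue in issues if not is_pull_request(issue)]
cards = [issue_to_markdown(issue) for issue in printable]
```

Calling `write_pdf(issue)` for each entry in `printable` would then produce one PDF per issue, ready to print.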

Cryptic Crossword for Sarah

This is a cryptic crossword that Jenny and I wrote for my sister for her birthday, and I thought it might be worth putting up here (with her permission) in case anyone wants to give it a try. Many of the answers were chosen because the word might have some particular meaning for her, but I think they should all be generally accessible words. 7a relies on a reference which may be unfamiliar to some, however. The intention was to make the clues pretty easy, since she’s a relatively new solver, and the grid we ended up with to fit in all the themed words is quite a tough one.

Here’s a link to a PDF of the crossword.

Books in 2012

This is a rough list of the books I read in 2012 with some brief comments – I’ve seen other people do this on their blogs and enjoyed reading their summaries, so thought that I would have a go. (Originally I added a mention of each person who recommended one of these to me, but that turned out to be problematic, both in privacy and completeness terms – let me just say that I’m always grateful for the wonderful recommendations I get from friends and colleagues.)

Fear and Loathing in Las Vegas – Hunter S. Thompson

I’ve read it at least twice before – this time I didn’t feel the momentum of the writing bowling me along quite as it did when I first read it, but that’s probably to be expected.

Bury the Chains – Adam Hochschild

An excellent account of the movement in Britain to abolish slavery. It’s still very relevant today if you’re interested in activism and campaigning, and, as the author points out, it’s the story of the first mass movement where a group campaigned for the rights of people other than themselves.

Mort – Terry Pratchett

This is the first time I’d re-read it in many years, although I usually suggest it (as do many people) as the first Terry Pratchett novel to try if you’ve never read any of his Discworld novels before. I was glad to find it still every bit as enjoyable as the first time. (At the end of the previous year I’d finished the Discworld series, after not reading any of them roughly between the ages of 18 and 32.)

Mockingjay – Suzanne Collins

I’d listened to the first two of the Hunger Games novels as audiobooks when I still had Emusic’s “one audiobook a month” deal. It’s worth finishing the trilogy if you’ve read the first two, which I also enjoyed despite finding the character of Katniss terribly frustrating (as is probably intended). The nature of the resistance movement is interesting, and I found the conclusion satisfying.

Knots and Crosses – Ian Rankin

Hide And Seek – Ian Rankin

Tooth and Nail – Ian Rankin

Strip Jack – Ian Rankin

The Black Book – Ian Rankin

Mortal Causes – Ian Rankin

Let It Bleed – Ian Rankin

Black and Blue – Ian Rankin

The Hanging Garden – Ian Rankin

Dead Souls – Ian Rankin

Set in Darkness – Ian Rankin

The Falls – Ian Rankin

These are the first 12 of Ian Rankin’s Rebus series of detective novels, and I’ll probably read the last few of them in 2013. I found them quick to read, and I loved the portrait of Edinburgh – it’s much more true to my experiences of the city both as a child and an adult than virtually any other books I’ve read that are set there.  I found the character of Rebus compelling as well.  The first two novels (Knots and Crosses and Hide and Seek) have some points where the writing really jarred for me, but from the third onwards they’re consistently well-written.

The Player of Games – Iain M. Banks

I seem to re-read this about once a year – it’s one of my favourite Iain M. Banks novels (along with Consider Phlebas and Excession). I’ve actually stopped reading his new books, the last straw being The Steep Approach to Garbadale, where I spent most of the book thinking “oh, please don’t let the twist be the one that I think is coming, [a disturbing theme from several previous novels]”, and of course it was…

The Open-Focus Brain – Les Fehmi

This was a kind present from a student in Zürich who was concerned about my style of concentration. The idea of the book is that one can switch from a very intense “fight or flight” style of concentration to something the author describes as “open focus” by using various techniques, in particular thinking about volumes of space. It comes with a CD with some example exercises. When I’ve tried this it’s certainly been relaxing – I hadn’t tried any form of meditation before. It’s also interesting that thinking about the space around you is something that’s used in the Alexander technique and (I’m told) various meditation techniques. Unfortunately, the book is written in a self-help style that makes me rather suspicious of it, particularly the wide-ranging claims for its health benefits, sometimes based on single cases or unpublished data.

Moonwalking with Einstein – Joshua Foer

A book about memory champions and memory techniques of the “a journalist tries to become really good at something relatively obscure” genre (see also “Word Freak”, “Born To Run”, etc.) It’s a quick read, and if you’re interested in the subject of memory, or building expertise Anders Ericsson-style, worth a look.  In a strange coincidence, 6 months after I read this, my partner started work at a company run by one of the memory champions who appears in the book.

Why Have Kids? – Jessica Valenti

This probably isn’t what it sounds like from the title – it’s an excellent discussion of the absurd pressures on mothers (particularly in US culture) to meet impossible standards. It was written by Jessica Valenti after having her first child.

The Revolution will be Digitised – Heather Brooke

An interesting account of the Wikileaks saga from Heather Brooke, who is a fantastic campaigner on freedom of information issues. The most extraordinary sections deal with Julian Assange’s interactions with her, including his advances wherein he identifies himself with Jesus…

Born to Run – Christopher McDougall

An entertaining book about ultra-runners, barefoot-style running and the Tarahumara people, who run huge distances without apparently being susceptible to the injuries that plague runners elsewhere. It’s great in terms of adventure and storytelling, with lovely portraits of the runners the author meets. I wouldn’t read it if you’re looking for rigorous science and anthropology, but it’s excellent fun.

Rocket Surgery Made Easy – Steve Krug

I generally haven’t added work-related books to this list, but since I read this one straight through, it seemed as if it fitted better than the others, which I tended to dip into and out of more. This is by Steve Krug as a follow-up to “Don’t Make Me Think”, and aims to give you step-by-step guidance for running DIY usability testing sessions of websites. I found the style somewhat irritating (as so often where technical books try for a light touch) but in practice it was very helpful for running some usability testing last year. I’m hoping I’ll get to do that again next year.

The House of Silk – Anthony Horowitz

A new Sherlock Holmes novel by Anthony Horowitz. It’s a disturbing book, but brilliantly matches the style of the Holmes stories, I thought.

Plan of Attack – Bob Woodward

Bob Woodward’s book about the decisions that led to the invasion of Iraq in 2003 – I found this frankly terrifying.

The Sense of an Ending – Julian Barnes

The 2011 Booker prize winner – consummately well-written and an excellent read.

The Narrative of John Smith – Arthur Conan Doyle

The unpublished first novel of Arthur Conan Doyle. I found it tiresome, and not terribly interesting – I wouldn’t recommend it unless, I suppose, you’re a Conan Doyle completist.

Catherine of Aragon: Henry’s Spanish Queen – Giles Tremlett

A sympathetic biography of Catherine of Aragon – embarrassingly, I know virtually nothing of Tudor history, so this wasn’t just interesting, but almost entirely novel for me…

The Strange Case of Dr Jekyll and Mr Hyde – Robert Louis Stevenson

I’d forgotten how short this story is, in fact, but it’s still hugely enjoyable. It’s tense throughout, and I’d forgotten about the interesting character of the narrator, a quiet man who’s happy sitting in silence by the fire with his old friends.

Kidnapped – Robert Louis Stevenson

I don’t think that I’d ever read the original text of Kidnapped before, just a children’s edition and later a graphic novel version. Anyway, it’s still a great adventure and Balfour’s journey crosses lots of parts of Scotland that I know from past walking holidays.

The Ascent of Rum Doodle –  W. E. Bowman

An excellent comic novel from the 1950s about a group of mountaineers attempting to climb the fictional mountain “Rum Doodle” with the help of thousands of porters and a supply of Champagne for “medicinal purposes”. Very silly, and highly recommended.

Care of Wooden Floors – Will Wiles

This is the first novel by a friend I knew at college. I enjoyed this very much – I’ve been accused of giving away too much when discussing it before, so I’ll just say that it’s hilarious and agonising in equal parts, and definitely worth reading.

Wolf Hall – Hilary Mantel

Another Booker prize winner; my ignorance of history probably meant that I missed quite a lot of pleasure in associating the wonderfully drawn characters with their actual stories, but I enjoyed it nonetheless. The (much discussed) style of the writing, in particular the use of the pronoun “he”, initially irritated me a great deal, but I got used to it after a hundred pages or so. I really did think that Wolf Hall was good, but I don’t really see why so many people are quite so passionate about it.

The Afterparty – Leo Benedictus

Another novel written by someone that I used to know, and happily another great read. I found this story of a naive journalist caught up in events around a celebrity party completely gripping, and I enjoyed the meta-narrative too – it’s a smart and very well-written novel.

How To Be A Woman – Caitlin Moran

An often hilarious book – there are several sections that had both of us laughing out loud. The subject matter’s also great – I think it’s essentially aimed at younger people who would say that they’re in favour of equal rights for women, but bizarrely would also say that they’re not feminists.

The Blank Wall – Elisabeth Sanxay Holding

Elisabeth Sanxay Holding was an author that Raymond Chandler regarded as “the top suspense writer of them all”, but who is relatively unread. (You can get it from Persephone Books.) This is the story of a housewife during WWII who is trying to protect her family from events that are going out of control, a bit like Brief Encounter crossed with hardboiled detective fiction – I enjoyed it very much.

Quiet – Susan Cain

This is probably the most personally and professionally relevant book I read this year, as someone who’s introverted but works hard to not give that impression.  It’s about the ways in which society increasingly values the qualities we associate with extroversion rather than introversion, and why that might be a mistake. I found it encouraging and thought-provoking – I’d recommend it to anyone who’s quiet, introverted or sensitive (the book deals with several associated personality traits). There’s also a little material towards the end about how these traits can affect relationships which struck quite a few chords with me.