Open Data GB postcode unit boundaries


In October 2020 there was big news for postcode geography geeks like me: the ONS added a postcode column to their NSUL dataset, which gave us for the first time a coordinate and postcode for each property in Great Britain as open data. I’d previously tried to generate polygons for the boundary of each postcode based on the single coordinate per postcode found in ONSPD, but the results were very approximate. With this new data I could use a very similar technique, but this time with coordinates for all the postal addresses in each postcode to get much more accurate postcode unit polygons.
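
To make the technique concrete, here’s a toy, stdlib-only Python sketch of the basic idea: every location is assigned the postcode of its nearest address point (a discretised version of a Voronoi diagram), and the cells that share a postcode together make up that unit’s polygon. The real pipeline computes a proper Voronoi diagram in PostGIS; all the coordinates and postcodes below are invented for illustration:

```python
from math import hypot

# (easting, northing, postcode) triples, as you might read from NSUL
points = [
    (100, 100, "AB1 2CD"),
    (105, 102, "AB1 2CD"),
    (200, 180, "AB1 2EF"),
]

def nearest_postcode(x, y):
    """Postcode of the address point closest to (x, y)."""
    return min(points, key=lambda p: hypot(p[0] - x, p[1] - y))[2]

# Rasterise a small area: each grid cell gets the postcode of its
# nearest address point. Tracing the boundary between the two groups
# of cells would give (approximate) postcode unit polygons.
cells = {
    (x, y): nearest_postcode(x, y)
    for x in range(90, 211, 10)
    for y in range(90, 191, 10)
}
```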

This turned out to be quite a lot of work, but the results are pretty satisfying, I think. The two biggest problems with the end results are:

  1. NSUL only covers Great Britain; I don’t know of any similar dataset for Northern Ireland.
  2. NSUL doesn’t just contain UPRNs that represent deliverable postal addresses; it also has coordinates and postcodes for things like whole streets, pollution monitoring devices, verges, historic UPRNs and so on. Since I can’t filter these out with any open data that I know of, these points distort some of the boundaries. See the section “How does this compare to ‘Code-Point with Polygons’?” below.

However, as far as I know these are the best open data boundaries of GB postcode units available for free, and they could still be improved by some work on excluding UPRNs that aren’t current postal addresses. (See “What could be improved?” below for some suggestions.)

What is this dataset useful for?

I suppose that the main use-case for these boundaries is that you have some data where values are associated with postcodes, and you want to visualize them on a map.

For me personally, it also just sates my curiosity about the notional shapes of postcodes, e.g. where they just run down one side of a street, how new developments break up existing patterns, etc. etc.

Get the data

You can download GeoJSON for the postcode unit, sector, district and area polygons from here:

Download the data as GeoJSON files (approx. 1GB)

In case you just want to explore the data a bit, you can search for a postcode or location in this MapIt instance and click through to see the postcode unit, sector, district or area boundaries. Please don’t rely on this MapIt as an API, since it costs a not insignificant amount to run, and unless there’s a lot of interest in this I’m likely to take it down at some point. If you are interested in using this on an ongoing basis, please get in touch with me.

Future maintenance

I’m publishing this partly because it’s been a fun project and I find the results very pleasing, and partly to gauge interest in this dataset. It’s worth making clear that I’m not committing to making regular releases as NSUL is updated. It’s rather expensive in EC2 time to regenerate this data, and I’m not sure that an individual with very little free time is the right person to be doing that anyway.

If you would be interested in getting regular updates to this data (or working on improving it) or you represent an institution that might have the resources and longevity to take over maintenance of this, please do get in touch with me.


  • This data is only for Great Britain, not Northern Ireland. As far as I know there is no equivalent dataset to NSUL – i.e. one giving the coordinates and postcode for each UPRN – that would let me generate postcode polygons of similar accuracy for Northern Ireland. If you know of one, please let me know!
  • You’ll find that some of the postcode polygons overlap. There are broadly two cases where this happens:

    • Genuine “vertical streets”, like the DVLA building in Swansea, where there are multiple postcodes for a single building.
    • Some UPRNs in the NSUL dataset represent historic addresses, where a building previously had a different postcode. Since I can’t filter out these historic UPRNs with open data, you get cases like the one shown below, where SW2 1EP and SW2 1BX overlap because there’s a historic UPRN for a building with postcode SW2 1BX that was knocked down and replaced by a block of flats, each of which has the postcode SW2 1EP.

      An image of postcode polygons in Brixton, showing that SW2 1EP and SW2 1BX overlap due to a single point with multiple UPRNs, one of them historic
      Click through for the animated GIF – this shows the overlap between the polygons I’ve generated for SW2 1BX and SW2 1EP
  • Because of memory constraints, I couldn’t calculate the Voronoi diagram for every point in NSUL in one pass, so I calculated it for each region of Great Britain separately. This means that the postcode units that cross from one region to another (e.g. TD9 0TU or CH4 0DF) aren’t as neat as they might be.
  • This data is based on the October 2020 release of NSUL, which is fairly old now.

How does this compare to ‘Code-Point with Polygons’?

This dataset is similar to the Ordnance Survey’s Code-Point with Polygons product, which is very much not Open Data. (I’m not even sure how much it costs because their pricing is so opaque.)

Going by the description of Code-Point with Polygons on the Ordnance Survey’s website, it sounds like it’s generated with a very similar method to the one I’m using, but unlike me they are able to filter out all the points that don’t represent current deliverable postal addresses, which means that their polygons are higher quality.

We can look at a small example of how the differences between these datasets arise, since the user guide for Code-Point with Polygons contains some images showing the postal addresses used as the basis of the data. Here’s a small part of Portsmouth from that document:

An example of Code-Point with Polygons source points and polygons from the user guide.

Here’s the same region from the data I’ve generated from NSUL, with a few of the extraneous points labelled:

Source points from NSUL and postcode unit polygons generated from them (i.e. part of the dataset I’m publishing with this post)

You can see by comparing the points in these images how many additional ones there are in the NSUL dataset compared to the one that the OS are using. I’ve labelled a few of the extra points in the latter image for the sake of showing what kind of things they represent. (I discovered what these are by looking them up in FindMyAddress, but that site’s license agreement is so restrictive that it’s not useful for anything more than an occasional lookup.)

Hopefully this example makes clear the kind of differences you should expect to see between these datasets. In addition to the extra points I can’t filter out, it also seems that the OS polygons have been snapped to the main road. That’s something one could theoretically do with open data as well. (See “What could be improved?” below.)

So, in summary, if you want or need the best quality postcode polygons for Great Britain, you should buy Code-Point with Polygons. I suspect that for a fair proportion of use cases, though, the data I’ve generated might be good enough and is free to use under the Open Government License.

What could be improved?

It’s rather unlikely that I’ll have time to do any of these any time soon, but here are some ideas for how to improve this data:

  • Robin Houston pointed out to me that Output Area boundaries are open data, and each Output Area is a union of a few postcode units. (The mapping from postcode unit to Output Area is readily available.) This means that you could filter out some proportion of irrelevant UPRNs by looking for coordinates that fall outside the Output Area that they should lie in and adding them to an exclude-list. I think this would be worth doing at some point, although you would have to be careful how you did it – e.g. there are plenty of postcodes that didn’t exist at the time of the last census, so you don’t want to accidentally exclude all of the points from those postcodes! Possibly the best time to do this would be when the Output Areas for the 2021 census are published, which is likely to be in April / May 2022.
  • One could pretty easily set up a tool to crowdsource finding points that are clearly outside a postcode. In many cases it is very obvious just by overlaying the positions of UPRNs over OpenStreetMap tiles that some are in the middle of a road, or miles away from the rest of the points in that postcode. You’d have to make it very clear to contributors that they have to do this by local knowledge or common sense, rather than referring to any AddressBase-derived data set, but I think something like this could work pretty well.
  • As I mentioned elsewhere in this post, lots of postcode unit boundaries divide one side of a road from another or appear to run down a river, say, so I can conceive that you could write some code to try to adjust the boundaries of these polygons to features like roads in OpenStreetMap, so long as moving that boundary wouldn’t include points not previously in the postcode boundary. I’m not sure where I’d start with this, though.
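
The first of those ideas is easy to sketch: with the Output Area polygon for a postcode to hand, a plain point-in-polygon test is enough to build the exclude-list. Here’s a hedged, stdlib-only Python sketch – the polygon, UPRN numbers and coordinates are all invented for illustration; the real OA boundaries and the postcode-to-OA lookup come from the ONS:

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: is (x, y) inside the polygon (a list of (x, y) vertices)?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            # x coordinate where this edge crosses the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Invented example: a square Output Area and two UPRNs whose postcode
# maps to that Output Area.
output_area = [(0, 0), (10, 0), (10, 10), (0, 10)]
uprns = [
    (1001, 5, 5),    # inside the OA: keep
    (1002, 50, 50),  # far outside: likely historic or non-postal, exclude
]
exclude = [uprn for uprn, x, y in uprns
           if not point_in_polygon(x, y, output_area)]
```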

One final point: it’s worth following this FOI request from Robert Whittaker, which asks for a list of historic and parent UPRNs; these are a big chunk of the UPRNs I’d want to filter out of NSUL for this project. I suspect it won’t be directly helpful, though: even if a list of historic and parent UPRNs were eventually released under FOI, it wouldn’t be usable as open data, and my aim here is to create open data. However, it’s interesting to read the refusals as a clear demonstration of how obviously broken the Ordnance Survey’s incentives and obligations are.

Lessons (re-)learned

  • When you run out of inodes on your filesystem, that’s a clear sign that you should be using a database instead of the filesystem.
  • Python’s multiprocessing package, and the Pool class in particular, are pretty easy to use and very effective.
  • As ever, PostgreSQL is just an amazing bit of software. (Also PostGIS and GeoDjango.)
  • QGIS is so powerful, and so fun to use!
  • AWS billing alerts are harder to set up than they should be, but it’s very important to do that for personal projects where you need a lot of CPU – accidentally leaving a powerful EC2 instance running can cost you a lot.
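
As an illustration of that multiprocessing lesson, here’s the pattern in its simplest form – in the postcode project the work items would be regions or groups of postcodes, but math.sqrt stands in here for the expensive per-item function so the example stays self-contained:

```python
from math import sqrt
from multiprocessing import Pool

def parallel_sqrt(numbers, processes=4):
    # Pool.map splits the list across worker processes and
    # reassembles the results in their original order.
    with Pool(processes) as pool:
        return pool.map(sqrt, numbers)

if __name__ == "__main__":
    print(parallel_sqrt([1, 4, 9, 16]))  # [1.0, 2.0, 3.0, 4.0]
```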


You may use this data under the Open Government License v3.0, and must include the following copyright notices:

  • Contains OS data © Crown copyright and database right 2020
  • Contains Royal Mail data © Royal Mail copyright and database right 2020
  • Source: Office for National Statistics licensed under the Open Government Licence v.3.0

Source code

The code to generate this data can be found on GitHub.


You can find a list of changes in versions of this data on GitHub.

Some enjoyable things from 2017

2017 was a pretty crap year in lots of ways; I tried to write quite a broad review of what went well and badly from my point of view in order to take stock, but that very rapidly sprawled, so I’ve cut that draft down to this post, which is mostly about cultural things that I enjoyed in the last year.


There are few things that give me so much pleasure as cinema, and so it’s rather disappointing that I didn’t see many new films in 2017. (I’ve been keeping a list of things I want to see, and added more than 50 films over the course of that year…) However, of the things I did see, my favourite films released this year were:

  • My Life as a Courgette
  • Get Out
  • Moonlight
  • The Red Turtle

They’re all amazing (in very different ways!) And I also really enjoyed:

  • The Last Jedi
  • The Lego Batman Movie (which is up there with Tim Burton’s Batman and Batman Returns as my favourite Batman movies. I’m happy to argue about the Christopher Nolan ones any time ;))
  • Hidden Figures
  • Arrival
  • Wonder Woman

The best film I saw for the first time this year, but which was released before 2017, was:

  • Kill List – just an extraordinary movie (though not for the faint-hearted, I should say).

The biggest cinematic disappointments for me last year were:

  • Blade Runner 2049  – beautifully made, but …  Jamie Zawinski’s review covers a lot of my problems with it, although I think there’s much more to be said about how thoughtless and exploitative the Joi storyline was.
  • La La Land – again, wonderful cinematography, but the song and dance numbers left me strangely cold, and the character of Seb (Ryan Gosling) behaved like an arse at almost every point in the movie, so I found it difficult to feel involved in the central love story.

Most re-read film criticism:

Favourite film podcast:


There’s so much amazing television around at the moment that I can’t keep up at all. For what it’s worth, though, my favourite new things this year were:

  • Bojack Horseman, particularly season 4
  • The Good Place

(Both are on Netflix.) The series I’ve most re-watched this year are:

  • Brooklyn Nine Nine
  • Community

They’re both so full of joy for me, and frustratingly I seem to be really bad at getting across to people why. (Random aside: I wish people wouldn’t put so much emphasis on the paintball episodes of Community – they’re fine, but basically nothing to do with what makes the series so great overall.)


It’s been a while since I wrote my last summary of interesting things I’ve been reading, and I’d like to do another one soon. However, just to pick out a couple of recent highlights, I’ve probably got most enjoyment recently out of re-reading Jane Austen – in particular Persuasion and Sense and Sensibility. On that subject, John Mullan’s “What Matters in Jane Austen” was great to read around the same time.


Although it might be barely detectable, I’ve been trying to write a bit more on this blog, publishing 7 posts last year, up from just 3 in 2016. Of these, the one that took by far the most work was on postcode boundaries. Probably the most delightful thing that happened related to this blog is that someone (@columbophile on Twitter) actually spotted an error in my silly Columbo data visualization and tweeted about it after I fixed the mistake :)

I do enjoy writing here, and I’m trying to do it more. I have a long-planned post about Python’s mock library in the works, which is probably top of my list. And similarly I’ve been thinking about a long post about The West Wing for ages, which I’d really like to get done.

Board Games

The highlight of the year for me for board games was a lovely weekend away with some friends to play some games.

The game I’ve most enjoyed this year is:

It’s interesting strategically, but doesn’t feel stressful to play, perhaps because of the lovely theme. And I love games that involve tiling polyominoes anyway.

I’ve also really enjoyed playing these:

Video Games

I haven’t played many computer games this year, but one of those that I did was one of the best games I’ve played in a long time (probably my favourite since Journey) – that was:

I had a lot of fun playing Behold the Kickmen too :)



I’ve been doing a lot of indoor bouldering this year and I’ve been really enjoying it – it clears my head, it’s fantastic exercise and makes you feel like a superhero :) I’m climbing at about V3 to V4 standard at the moment, I think. I hope that by working more on lower body flexibility and losing some body fat I can improve that somewhat next year.

I also did some more indoor lead climbing this year, learned to belay with a Gri Gri, and I’m hoping to start doing some outdoor climbing over the next year.

I’ve been mostly climbing at Vauxwall, which is a nice (and very conveniently located) bouldering wall in central London. They run a friendly competition every two months called “Vauxcomp”, which I’d recommend, with excellent music and great bouldering problems.


I haven’t done any big races this year (like another half marathon, or my first full marathon) but I’ve been running pretty steadily. I’m still doing parkrun every week I can, which is still a great way to keep at it, and it’s nice to know some of the regulars a bit now. Injury-wise I’ve had a mixed year – I had a painful calf muscle problem which put me out for a while, but I’ve not had a relapse of that since starting doing these stretching exercises, kindly recommended by a colleague, before running.

My left knee has been hurting quite a bit at about the 8km point of any run recently, and I suspect I’ll have to consult a physiotherapist about that.

I’ve started using Strava to track my running, and I’m quite liking the light social aspects of that.

Two weeks off

I’m just coming to the end of two weeks off work – not for travelling anywhere, but just time spent mostly at home trying to relax and think about what I’m doing a bit more calmly.

This year has been a pretty stressful one at work, with me (unusually) having to work long days and some weekends, but we’re in a period of respite for the moment, and my lovely colleagues had been strongly encouraging me to take some proper time off (rather than just odd days here and there). One of the amazing things about working at mySociety is the organisation’s real commitment to discouraging people from working over 37½ hours a week. Sometimes it has to happen, though, as it has for me this summer, but most of us keep time-sheets (in Freckle) and at the end of each quarter, we get an email saying how much over that we’ve worked, with a reminder to take it as TOIL (Time Off In Lieu), which built up into quite a bit of holiday time for me.

We couldn’t go away anywhere, because Jenny couldn’t take time off, but I think this was for the best – I find travel stressful at the best of times, and everything I wanted to do was probably easier from home anyway.

Here are some of the things I’ve been doing over the last few weeks:

  • Sleeping, lots of sleeping. I don’t just mean sleeping in in the morning, but also an hour or so during the day. It’s amazing: I highly recommend sleep.
  • Doing some writing: some of this was private (more journal-like material) but also a couple of blog posts I’d been trying to get done for a while on The Great British Bake Off and learning SPARQL for Wikidata. I didn’t get as many public blog posts done as I’d hoped, but it’s good to be reminded that any post with technical content does take a long time, I suppose.
  • Exercise: running and climbing a bit more than I’d been able to recently. I also took some “bouldering selfie videos”, which I put on Instagram, even if that feels a bit weird.
  • I had a lovely afternoon and evening of board games with a couple of friends, which gave us a chance to do a slightly heavier strategy game than we normally have time for. In this case, that was Viticulture, but we also got in a game of Roll for the Galaxy and Star Realms. I didn’t do very well in Viticulture, but I think that’s to be expected the first time you play such a complex worker-placement game. The amount of time it took to play (I think our game was a whopping 4½ hours) is a bit off-putting, but I hope I’ll get another chance to try it.
  • We went up to Cambridge for a Christmas dinner with some friends at a Cambridge college, which was a very pleasant evening.
  • Reading: it was really great to have some relaxed and extended time for some reading; normally it feels like I’m just grabbing odd 15 minute periods here and there for reading. A couple of the things I was reading were about addiction and dependence, which has been thought-provoking and helpful as I’m trying to be more introspective about how I drink in particular. (I’d been teetotal for nearly two years, and I’m now having some alcohol again in situations where I’m pretty sure it’s not going to be unhealthy, and I still want to understand this all better.) These books were:
    • “Chasing the Scream” by Johann Hari. This is about all sides of the drug war, and makes a compelling case for decriminalisation and legalisation. I’m not wild about how it’s written, but it’s a very effective argument nonetheless.
    • “Her Best Kept Secret: Why Women Drink – And How They Can Regain Control” by Gabrielle Glaser. I read this on the strength of the author’s Atlantic article about Alcoholics Anonymous, which challenges the idea that 12-step programmes are the best way to treat problem drinking. The book has more on the problems with AA, on how wine was marketed to the US, and on how women’s drinking has changed over the last century.
  • I had a violin lesson for the first time in about 20 years, since I had an orchestra concert coming up. This was helpful – I’m wary of making a regular commitment to having lessons because I have so little free time for practice (or at least time to practice when it wouldn’t disturb people), but having a lesson every now and then would help me, I think.
  • Played in a Chamber Academy Orchestra concert, which was made up of the Beethoven Egmont Overture, Mozart’s Double Piano Concerto No. 10 and Dvořák’s Symphony No. 9 (The New World Symphony). I still have lots of nerves around performing music, but I enjoyed this one.
  • Caught up on quite a lot of TV that people had been recommending for ages:
    • Bojack Horseman: starts pretty good and just gets better and better – season 4 in particular has some amazing, dark episodes.
    • Star Trek: Discovery – it’s pretty variable at the start (and I do find the new Klingons quite dull) but there are a couple of episodes towards the end of the first nine (all that’s out so far) which I think are up there with my favourite Star Trek episodes from any series.
    • Rick and Morty: very mixed feelings about this; I find some of the voice acting (particularly Rick and Morty) just irritating, a lot of the humour is pretty juvenile, and I hate all Rick’s belching and drooling. At the same time, I like the science fiction side of it, the fourth-wall breaking jokes, and so on. The shoutiness of it gets quite wearing if you’re binge-watching.
    • The Good Place: I actually finished watching this before I started holiday, but in my head it’s grouped with these “recommendations from other people I’ve watched recently” – anyway, it’s excellent, and you should watch it :)

More broadly, I think it was good for me to have a bit of time to be more reflective about what I’m doing with life both in work and more generally. It wasn’t quite as calm as I’d have liked for this, but a lot better than the previous 4 months…

(The title of this post is a reference to one of my favourite Underworld songs: Two Months Off.)

Why should you learn SPARQL? Wikidata!

Why learn SPARQL now?

I think that SPARQL has something of a bad reputation in the open data community – my impression is that that came about because when organisations published data by making a SPARQL endpoint accessible, people had problems like:

  • Many SPARQL endpoints not working reliably
  • Writing queries that joined data between different endpoints never quite worked
  • The very variable quality of data sources
  • People just wanted CSV files, which they already know how to deal with, not an endpoint they had to learn a query language to use

The last of these, I think, was particularly important: people who just wanted to get data they could use felt they were being sold semantic web tech that they didn’t need, which was getting in the way of more straightforward ways of accessing the data they needed.

However, nowadays I think there is a compelling reason to have a good command of SPARQL, and that is: the Wikidata project.

What is Wikidata?

Wikidata is an amazing project, with an ambitious goal on the scale of OpenStreetMap or Wikipedia: to provide a structured database of the world’s knowledge maintained by volunteers (and, like Wikipedia, anyone can edit that data).

One of the motivations for the Wikidata project is that lots of data that you see on Wikipedia pages, particularly in the infoboxes, is just stored as text, not structured data. This creates the strange situation where data like the population of a city might be different on the German and French Wikipedia and each language community maintains that independently – a massive duplication of effort. Some of these infoboxes are now being migrated to be generated from Wikidata, which is a huge win, particularly for the smaller Wikipedia communities (e.g. Welsh Wikipedia has a much smaller community than Spanish Wikipedia so benefits a lot from being able to concentrate on things other than redundantly updating data).

However, because of its potential as a repository for structured data about anything notable, the project has become much more than just a way of providing data for Wikipedia info boxes, or sitelinks – data in Wikidata doesn’t necessarily appear in Wikipedia at all.

A project that I’m involved in at work, for example, is writing tools to help to get data about all the world’s politicians into Wikidata. This effort grew out of the EveryPolitician project to ensure its long term sustainability.

Why might Wikidata change your perception of SPARQL?

The standard way to run queries against Wikidata nowadays is to write SPARQL queries using the Wikidata Query Service.

Going back to those points at the top of this post about why people might have felt that there was no point in learning SPARQL, I don’t think they apply in the same way when you’re talking about SPARQL specifically for querying Wikidata:

  • The Wikidata Query Service seems to work very reliably. We’ve been using it heavily at work, and I’ve been happy enough with its reliability to launch a service based on making live queries against it.
  • Because of the extraordinarily ambitious scope of Wikidata, there’s no real reason to make queries across different SPARQL endpoints – you just use the Wikidata Query Service, and deal entirely with information in Wikidata. (This also means you can basically ignore anything in generic SPARQL tutorials about namespaces, PREFIX or federation, which simplifies learning it a lot.)
  • Data quality varies a lot between different subjects in Wikidata (e.g. data about genes is very high quality, about politicians less so – so far!) but it’s getting better all the time, and you can always help to immediately fix errors or gaps in the data when you find them, unlike with many other data projects. For lots of things you might want to do, though, the data is already good enough.
  • Lots of open data that governments, for example, release has a naturally tabular structure, so the question “why should I learn SPARQL just to get a CSV file?” is completely valid, but Wikidata’s data model is (like the real world) very far from being naturally tabular – it’s a highly interconnected graph, supporting multiple claims about particular facts, with references, qualifiers and so on. If all you want is a CSV file, you (or someone else) can write some SPARQL that you can use in a URL that returns a CSV file, but you wouldn’t want to try to cope with all of Wikidata as a CSV file – or even all the information about a particular person that way.
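
For example, getting CSV out of a SPARQL query is just a matter of putting the query in a URL and asking for CSV. Here’s a sketch – the query is my own invented example, not one from the post, and you’d fetch the URL with an Accept: text/csv header to get CSV back:

```python
from urllib.parse import urlencode

query = """
SELECT ?series ?seriesLabel WHERE {
  ?series wdt:P31 wd:Q5398426 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 10
"""

url = "https://query.wikidata.org/sparql?" + urlencode({"query": query})
# Fetch with e.g.:
# urllib.request.urlopen(
#     urllib.request.Request(url, headers={"Accept": "text/csv"}))
```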

Some motivating examples

You can find some fun examples of the kinds of things you can query with SPARQL in Wikidata at the WikidataFacts Twitter account, e.g.

These examples also demonstrate a couple of other things:

  • There are some nice ways of visualizing results built into the Wikidata Query Service, like timelines, graphs, images, maps, etc.
  • The musical instruments example is clearly incomplete and likely biased by which collections of art are best represented in Wikidata; lots of queries you might make are like this – they’re interesting despite being incomplete, and the data will only get better over time. (Also, looking at the results can give you ways of thinking about how best to improve the data – e.g. what’s the biggest missing source?)

As an example of something I made that uses Wikidata (as a fun holiday project) here’s a site that can suggest a random episode of your favourite TV programme to watch (there are links at the bottom of the page to see the queries used in generating each page) and suggests ways of improving the data.

Finally, here’s a silly query I just wrote – I knew that Catmando was unusual in being a cat that was a leader of a political party, but wondered if there are other animals that have an occupation of “politician”. It turns out that there are! (click the “play” icon to run the query and see the results).
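
For the curious, a query along those lines (this is my reconstruction of the idea, not necessarily the exact query from the link) looks something like this:

```sparql
# Animals with the occupation 'politician', excluding humans
SELECT ?animal ?animalLabel WHERE {
  ?animal wdt:P106 wd:Q82955 .         # occupation: politician
  ?animal wdt:P31/wdt:P279* wd:Q729 .  # instance of some kind of animal
  FILTER NOT EXISTS { ?animal wdt:P31 wd:Q5 . }  # but not a human
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```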

Great! How do I get started?

There’s an excellent tutorial on using SPARQL to extract data from Wikidata. I’d definitely recommend starting there rather than with more generic SPARQL tutorials: those will tell you lots of things you don’t need to know for Wikidata, and miss out lots of useful practical tips. Also, the examples are all ones you can try or play with directly from the tutorial.

It’s a long tutorial, but you’ll get a lot out of it even if you don’t go through the whole thing.

Tips that helped me to write SPARQL more effectively

The original idea of this post was to pass on some tips that helped me to write SPARQL better (which I mostly got from Tony Bowden and Lucas Werkmeister) – though it turns out that lots of these are in the tutorial I linked to above in some form or other! Nonetheless, I figure it might be useful to someone to reiterate or expand on some of these here. Some are quite basic, while others will probably only make sense after you’ve been writing SPARQL for a bit longer.

1. Use the Wikidata Query Service for writing queries

There are a couple of features of the Wikidata Query Service that mean it’s the best way I know of to start writing queries from scratch:

  • If you don’t know which property or item you want, you can follow “wdt:” or “wd:” with some plain English, and hit “ctrl-space” to autocomplete it.
  • If you mouse-over a property or item, the tooltip will give you a human readable description of it.

Both of these will reduce your reliance on post-it notes with property numbers :)

Of course, writing queries in the Wikidata Query Service form also means you can try them out just with a button press (or control-enter :)).

2. Use Reasonator or SQID for browsing relationships of an item

Wikidata’s built-in item browser is a bit limited: the main way in which this is frustrating is that the item pages only show claims that the item is the subject of. For example, the item page for The X-Files looks like:

… and you can’t get from there to the episodes of the series, or its seasons, since those are the objects of the ‘series’ and ‘part of’ predicates, not the subjects. The Reasonator and SQID pages do let you follow these relationships backwards, however:

(Also, those alternatives show images and other media related to the items, which is nice :))

3. Learn what wdt:P31/wdt:P279* means

Wikidata’s very flexible way of modelling data means that trying to find a particular “type of thing” can be complicated. For example, if you want to find all television series, then the following query:

  ?series wdt:P31 wd:Q5398426

(which asks for any item that is an ‘instance of’ (P31) the item ‘television series’ (Q5398426)) wouldn’t return The Simpsons, since that’s an instance of ‘animated series’ (Q581714) instead.

‘animated series’, however, is a ‘subclass of’ (P279) ‘television series’. This means that if you change the query to:

  ?series wdt:P31/wdt:P279* wd:Q5398426

… it will include The Simpsons. That new version of the query asks for any item that is an instance of ‘television series’, or an instance of anything that is a ‘subclass of’ ‘television series’, following the ‘subclass of’ relationship up the hierarchy any number of times.

You’ll probably need to use this quite frequently. A more advanced variant that you might need to use is:

  ?series p:P31/ps:P31/wdt:P279* wd:Q5398426

… which will consider an ‘instance of’ relationship even if there are multiple ‘instance of’ statements and one of the others has ‘preferred’ rank. I’ve written more about the ‘p:’ and ‘ps:’ predicates below.

4. Add labels to your query results

It’s hard to tell whether your query has worked properly when the output is just a list of item numbers – for instance, in this query to find all episodes of season 1 of the West Wing:

SELECT ?episode WHERE {
  ?episode wdt:P361 wd:Q3730536
}

One way of improving this is to add variables with the “labels” of Wikidata items. The easiest way to do this in the Wikidata Query Service is to type “SERVICE” and hit control-space – the second option (beginning “SERVICE wikibase:label…”) will add a SERVICE clause like this:

SERVICE wikibase:label {
  bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en".
}

With this clause added in to the WHERE clause, you’ll find that you have some extra variables you can include in those you’re SELECTing, which are named with a suffix of “Label” on the end of the existing variable name. So to see the names of those West Wing episodes, the full query would be:

SELECT ?episode ?episodeLabel WHERE {
  ?episode wdt:P361 wd:Q3730536 .
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en".
  }
}

This should be much easier to understand in your output.

Here are some more advanced notes on this tip:

  • The default languages of “[AUTO_LANGUAGE],en” won’t be what you want in every situation. For example, in queries for politicians from Argentina, it made more sense for us to use “es,en”, which is saying to use the Spanish label if present, or otherwise an English one.
  • You should be aware that Wikidata has more flexible ways of representing names of people than the labels attached to the person’s item.
  • Sometimes the “Label” suffixed variable name won’t be what you want, and you’ll want to customize the variable name. (For example, this might come up if you care about the column headers of CSV output, which are based on the variable names.) In these cases you could rename them using the rdfs:label property within the SERVICE clause. In the example above we could do this like:
SELECT ?episode ?episode_name WHERE {
  ?episode wdt:P361 wd:Q3730536 .
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
    ?episode rdfs:label ?episode_name.
  }
}

5. Learn the helpful semi-colon, comma and square bracket syntax to abbreviate your queries

In your WHERE clauses, if you’re making lots of constraints with the same subject, you can avoid repeating it by using a semi-colon (;). There’s a nice example of this in the tutorial. With helpful indentation, this can make your queries shorter and easier to read.

That same section of the tutorial introduces two other shorthands that are sometimes very helpful:

  • a comma (,) to reduce repetition of the object in a constraint
  • using square brackets ([]) to substitute in the subject of another constraint

6. Understand the different predicate types

A big conceptual jump for me in writing SPARQL queries for Wikidata was realizing that you can do quite a lot just using the “wdt:” predicates, but they only allow you to query a limited subset of the information in Wikidata. For lots of things you might want to query, you need to use predicates with other prefixes. To explain the problem, here’s an example of the data that a wdt:P69 (“educated at”) pattern for Douglas Adams extracts:

As you can see, there are two “educated at” statements for Douglas Adams, and the only information that a wdt:P69 pattern extracts is the item representing the institution itself. As the diagram shows, there’s lots more information associated with those statements, like references and qualifiers, which you can’t get at just with a wdt: prefix. A further limitation is that if there are multiple statements, the wdt: predicates only extract the most “truthy” statements – so if one of them is given preferred rank, that’s the only one that will match.

Fortunately, there are other predicate types that you can use to get everything you see in that diagram. Firstly, you need a pattern based on a p: predicate, which relates a subject to a statement value – the statement values are the bits in the orange boxes here:

Then you can use other predicate types like pq:, ps:, pr: and prov:wasDerivedFrom to access details within those boxes. You can read about those predicates in more detail in this page I wrote (which I made these diagrams for).

To give an example (which might take some careful reading to understand!), suppose you want to find all the seasons of The West Wing, with their season numbers. Each season is ‘part of’ (P361) the series, and also has a ‘series’ (P179) relationship to the series. However, the number of the season within the series is only available as a ‘series ordinal’ (P1545) qualifier on the ‘series’ statement, so you need to use the p:, ps: and pq: predicates like this:

SELECT ?season ?seasonLabel ?seasonNumber WHERE {
  ?season wdt:P361 wd:Q3577037 .
  ?season p:P179 ?seasonSeriesStatement .
  ?seasonSeriesStatement ps:P179 wd:Q3577037 .
  ?seasonSeriesStatement pq:P1545 ?seasonNumber .
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en".
  }
} ORDER BY xsd:integer(?seasonNumber)

7. Use OPTIONAL when you don’t require a relationship to exist

Because lots of data in Wikidata is currently incomplete, you’ll often want to use OPTIONAL clauses in your WHERE query in situations where there’s extra information that might be helpful, but you don’t want its absence to prevent items from being returned.

Note also that you can nest OPTIONAL clauses, if you have OPTIONAL data that might depend on other OPTIONAL data.

For example, for TV programmes, the ‘series ordinal’ qualifier (on ‘series’ statements), provides both the number of a season and the number of the episode within a season. The latter is done much less consistently than the former, however, so you might want to make that part of your query optional. There are two parts that you might want to make optional:

  • Whether there’s a ‘series’ statement relating the episode to its season at all.
  • If there is such a ‘series’ statement, whether there’s a ‘series ordinal’ qualifier on it.

That’s a case where you might want to nest OPTIONAL blocks to make sure you get as much data as possible, even if this modelling is incomplete. For example, you could use this query:

SELECT ?episode ?episodeLabel ?season ?seasonNumber ?numberWithinSeason WHERE {
  ?episode wdt:P361 ?season .
  ?season p:P179 ?seasonSeriesStatement .
  ?seasonSeriesStatement ps:P179 wd:Q3577037 .
  ?seasonSeriesStatement pq:P1545 ?seasonNumber .
  OPTIONAL {
    ?episode p:P179 ?episodeSeasonSeriesStatement .
    ?episodeSeasonSeriesStatement ps:P179 ?season .
    OPTIONAL {
      ?episodeSeasonSeriesStatement pq:P1545 ?numberWithinSeason
    }
  }
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en".
  }
} ORDER BY xsd:integer(?seasonNumber) xsd:integer(?numberWithinSeason)

(In that example, all the data is now present so the OPTIONAL blocks aren’t necessary, but for other TV series where the episode number modelling isn’t complete, they would be.)

Great British Bake-Off Visualizations

For the last couple of years of The Great British Bake Off, I’ve been playing Fantasy Bake Off with colleagues and friends. It’s a fun and silly thing to do: each week you pick who you think will be star baker, who’ll come top of the technical challenge, and so on. The problem with this is that it’s quite hard to get any points at all, and this got me thinking about how to assess how well the contestants were doing over the course of the competition.

Collecting Data

The problem here is that there aren’t many quantitative measures available from the GBBO. However, I wasn’t going to let this stop me – as is probably clear from other entries in this blog, I feel that making spreadsheets about TV shows as you watch them is a perfectly respectable pastime, and since this was just for fun, I didn’t really care if there was a subjective element to the numbers I was putting into the spreadsheet.

The obvious measure of how contestants are doing is their position in the technical challenge each week, so I included that (12 points for first, 11 points for second, etc.). As well as that, after seeing what Paul and Mary/Prue said about the signature and show-stopper challenges, I’d try to give each bake a score from 1 (best) to 5 (worst), scaled so that each round had roughly the same weight. I also gave the star baker a few extra points.

The data that in retrospect I really should have included, but didn’t, was which bakers the judges described as also being in the running for star baker or elimination each week. That’s probably quite a strong signal, and it’s even recorded in Wikipedia’s helpful results summary table.

“Predicting” Results

This is the point at which I have to admit to some geek shame: I clearly should have used some of the techniques I learned this year from Andrew Ng’s Machine Learning course on Coursera to learn which combinations of features best predict the star baker, technical winner and eliminated baker each week, but frankly I didn’t think of that until I was already anxious to be finished with this whole silly project. (Also, given that I only have data for 2016 and 2017 and the results seem pretty random, I’m not sure how well that would work anyway.) Maybe that’s a project for 2018, though.

I did combine these scores into an ad-hoc overall score, however, evaluating how well each contestant had done cumulatively (with some decay applied to earlier weeks) and picked my winners for Fantasy Bake Off from that. It’s hard to say how well this worked: for example, it didn’t work as well last year as the method of one of my colleagues, which was “ask their grandmother who would win each week”. Also this year I was beaten by a colleague who was many weeks behind in actually watching the show, but guessing each week anyway, and who amazingly managed to pick all three finalists in the first week. Nonetheless, my method was probably better than me just guessing based on my vague impressions of who was doing well at each point.

Visualization: bumps / horserace charts

I’d been really impressed by Anna Powell-Smith‘s visualization of the progress of the premier league over a season as a “bumps chart” or “horserace”, and wondered what my GBBO scores would look like if you plotted them in a similar way. My cumulative scorings up to and including each week could be considered a bit like points in a league, week-by-week. Would you be able to see the “character arc” of each contestant from this view of the data? I asked Anna if she would let me use her code to try this, and she kindly said yes, allowing me to make this:

Click on the image to go to a larger, interactive version

Getting that to work took a day or so, even with Anna’s source code, since it meant me learning more about D3, but I found that rather good fun.

However, coincidentally, Anna had recently started working at Flourish (a data visualization service that’s grown out of Kiln, whose work you will probably recognize) and she suggested that I try using their service to produce a similar visualization, using their “horserace” graph type. This turned out to be pleasingly easy to do: the preloaded example shows you what rows and columns are expected in your uploaded CSV file in an obvious way, and the results are very nice, I think. (Although there’s the odd thing I couldn’t see how to do, like adding an indication on the lines of any point where someone got star baker.) For example, here’s the 2017 competition using my “subjective” measure:

Click on the image to go to a larger, interactive version

There are some nice things that jump out at you from this visualization – obviously I’m picking things that accord with my view of the competition, so you may well disagree!

  • Steven’s incredible start to the competition
  • Sophie’s very consistent excellence later in the competition
  • Kate’s extraordinary luck in not being eliminated week after week towards the end
  • The harshness of Liam’s elimination, given his record in the weeks leading up to it
  • Julia’s dramatic rise and fall
  • Peter doing well enough in the first round (particularly in the technical) that, even though he was eliminated, my scoring put him above Chris, Flo and Tom (the next three to be eliminated)

For comparison, here is what the race looks like if you just consider the bakers’ placing in the technical challenge:

Click on the image to go to a larger, interactive version

I like that slightly different things jump out here: Steven doing relatively less well in the technical challenges, and the impressive performances of Sophie, Julia and James in the technical challenges.

I had similar data from 2016, so it might be worth talking a bit about that too. Here’s the graph that includes my subjective assessments for 2016:

Click on the image to go to a larger, interactive version

… and here’s the corresponding graph just based on technical challenge placings:

Click on the image to go to a larger, interactive version

The big things that I think jump out at me from these are:

  • Benjamina being eliminated in week 8 was really tough! She’d been so consistently good up to that point. I know it’s all about what happens in a given week (except for tie-breaking) but still…
  • Candice won, of course, but my subjective measures suggested Andrew had done very slightly better cumulatively over the competition. (Again, I know what happens in the final week is what really matters, and Candice was a worthy winner :))
  • Candice did consistently very well in the technical challenges

(I’ve put those four graphs into a Flourish “story”, which is perhaps a nicer way to go through them.)


  • For something that was meant to be a quick and silly hack, this ended up taking quite a lot of time.
  • I maintain that quantifying TV programmes with dubious methods in spreadsheets is fun and worthwhile :)
  • Flourish is really nice to use – I’m excited to see how that platform develops, and grateful that they gave me a beta account to try it out.
  • The Great British Bake Off is one of the greatest TV shows ever (but that probably should be the subject of a different post).
  • D3 is excellent, and I hope I’ll come up with some excuses to use it more in future.
  • Predicting results of the GBBO is difficult, and obviously futile in one way (you’re trying to predict something that’s already happened) but that doesn’t stop Fantasy Bake Off being entertaining. I recommend it for your workplace too.

Dealing with awkward subtitle problems in Handbrake

I know very little about Handbrake; these are just some notes on what I personally do to work out why subtitles aren’t being ripped properly and to fix that manually – I almost certainly can’t answer any questions about issues you might be having! This is just here in the hope that it might be useful to someone…

I’ve been on a long-running project to rip some of our DVDs to network attached storage, so that playing them is a much more pleasant experience: we can then easily play any of our DVDs around the flat without suffering cute-but-unusable DVD menus, condescending anti-piracy warnings or trashy trailers. With some DVDs that are already quite poor quality I’ll just image the disk and copy the ISO to the NAS, but in most cases I use Handbrake to transcode just the titles I want from the DVD using a codec with rather better compression than MPEG-2. There’s inevitably a loss of quality doing this transcoding, of course, but in most cases I don’t mind.

DVD Subtitles

One slightly vexing issue that sometimes comes up when doing this is what to do about the subtitle tracks on the DVD. For plenty of films or TV shows there aren’t any points at which you’d expect a subtitle to be shown, so you don’t need to worry about them. However, plenty of our DVDs do have brief sections in foreign languages, for example, that you’d expect to be subtitled.

There are various choices about how you can handle these when ripping, but I prefer what’s really the option that loses most information: “burning in” any subtitles that are there for the purposes of translating foreign language sections into English or are added for comic effect. “Burning in” these images from a subtitle track means that you’re overlaying the images before the source video is transcoded, so you can’t remove those subtitles or switch them to other languages afterwards. Obviously lots of people prefer not to burn in the subtitles for that reason, but I tend to do this because it means I’m not subject to the vagaries of how different video players handle subtitles, and these rips are only for my own personal use anyway.

As well as “burning in” subtitles, another concept you might need to know about is “forced subtitles”. Any subtitle in a given subtitle track might be marked as “forced”, which tells the DVD player to display it even if the viewer hasn’t chosen to see all the subtitles from that track. As I understand it, the intended use of this is exactly for brief foreign-language sections in films.

In most cases, what works fine in Handbrake for what I want to do is to use the “Foreign Audio Search” option in the “Subtitle List” tab, selecting “Burned In” and then guessing whether to tick the “Forced Subtitles Only” box or not – generally I’ll leave that unchecked, unless when I look at the list of available subtitles (under “Add”) there’s only one English language subtitle track, in which case it’s probably forcing subtitles for any foreign language sections that are subtitled. This option should look through all the subtitle tracks for ones that appear less than 10% of the time, look for forced subtitles, and make a sensible choice about what to use: the Handbrake documentation explains this.

However, there are various ways this can go wrong – the “Foreign Audio Search” option sometimes makes the wrong choice in peculiar ways – e.g. I’ve seen it pick the right track when you’re just ripping one chapter from a title, but the wrong one when you’ve selected all the chapters (!). Also, there’s just very little consistency in how DVD producers choose whether to mark subtitles as forced.

When it goes wrong, here’s the method I use to manually pick the right subtitle track to burn in – essentially this is to do the “Foreign Audio Search” scan on just one chapter that I know both has audio that should and should not be subtitled, look through the “Activity Log” to see the results of that scan, and then manually select the right subtitle track based on that.


To step through that in more detail, here’s what I’d do:

  • Select from the “Subtitle List” tab the “Foreign Audio Search” option, and add that to the subtitles list. It doesn’t matter what options you choose, since we’re just adding this to get the results of the search into the activity log.
  • Find a chapter in the title you want to rip that has some audio you’d want to have subtitles for, and some that you wouldn’t. (You can select multiple chapters to achieve this if you want – the point of choosing a small number of chapters is only to make the scan quicker.) I’d normally do this by (a) knowing a bit of the film with such audio and (b) finding that bit of the film in Totem, which conveniently shows the title and chapter number in the window title.
  • Select just those chapters to rip, and add them to the queue
  • Start encoding
  • Open the activity log window
  • Once the “subtitle scan” pass has completed (it should be the first thing that Handbrake does) scroll up in the activity log to find the lines that look something like this:

[13:35:12] Subtitle track 0 (id 0x20bd) ‘English’: 87 hits (0 forced)
[13:35:12] Subtitle track 1 (id 0x21bd) ‘English’: 89 hits (0 forced)
[13:35:12] Subtitle track 2 (id 0x22bd) ‘English’: 6 hits (0 forced)

  • That means that subtitle track 2 is the right one, because there are subtitles for only some of the audio – the other two probably have subtitles for every line of dialogue, even if it’s in English. So, now we want to set up Handbrake to rip the complete title but with that subtitle track added manually:
  • Remove the “Foreign Audio Search” option from the subtitle list.
  • Click “Add” in the subtitle list tab, and select subtitle track 3 (n.b. not 2, since the graphical user interface (GUI) for Handbrake numbers subtitle tracks starting at 1, not 0). Make sure you don’t select “Forced Subtitles Only”, since the subtitles on that track aren’t forced (see “0 forced” in the output above). (I would also select “Burn in” whenever adding a subtitle track manually like this, for the reasons discussed above – but you might well have different views about that.)
  • Then select all the chapters of the title, add the title to the queue and rip it as normal.

As other examples of things you might see in that list of “hits” in the results of the foreign audio search pass, consider the following:

[16:57:06] Subtitle track 0 (id 0x20bd) ‘English’: 77 hits (0 forced)
[16:57:06] Subtitle track 1 (id 0x21bd) ‘English’: 78 hits (0 forced)
[16:57:06] Subtitle track 14 (id 0x2ebd) ‘English’: 8 hits (8 forced)

In this example, it happens that the subtitles are “forced”, but it’s still clear that track 14 (0-based) is the one that just has the subtitles for the foreign-language section, so I’d add track 15 (14 + 1) manually in the GUI as above – and in this case it doesn’t matter whether you select “Forced Subtitles Only” or not, since all of the subtitles on that track appear to be forced.

As a final example, you might see output like this:

[18:23:55] Subtitle track 0 (id 0x20bd) ‘English’: 92 hits (11 forced)

In that example there’s a single English subtitle track, which marks some subtitles as “forced” to indicate that those are for foreign-language audio. In that case, in the GUI I would manually add subtitle track 1 (0 + 1), but would have to select the “Forced Subtitles Only” option to avoid getting subtitles for everything.
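The manual reasoning above can be sketched as a little Python helper – this is my own code, not anything from Handbrake, and the log lines are simplified (straight-quoted) versions of the examples in this post:

```python
import re

# Parse the subtitle-scan lines from Handbrake's activity log and pick the
# track with the fewest hits (the one that only covers the sections needing
# translation). Returns the 1-based track number to select in the GUI, and
# whether "Forced Subtitles Only" is needed (single track, partly forced).
LINE_RE = re.compile(
    r"Subtitle track (\d+) \(id 0x[0-9a-f]+\) '([^']+)': (\d+) hits \((\d+) forced\)"
)

def pick_gui_track(log_lines):
    tracks = []
    for line in log_lines:
        m = LINE_RE.search(line)
        if m:
            num, _name, hits, forced = m.groups()
            tracks.append((int(num), int(hits), int(forced)))
    if not tracks:
        raise ValueError("no subtitle-scan lines found")
    num, hits, forced = min(tracks, key=lambda t: t[1])
    forced_only = len(tracks) == 1 and 0 < forced < hits
    return num + 1, forced_only  # the GUI numbers tracks from 1, the log from 0

log = [
    "[13:35:12] Subtitle track 0 (id 0x20bd) 'English': 87 hits (0 forced)",
    "[13:35:12] Subtitle track 1 (id 0x21bd) 'English': 89 hits (0 forced)",
    "[13:35:12] Subtitle track 2 (id 0x22bd) 'English': 6 hits (0 forced)",
]
print(pick_gui_track(log))  # → (3, False): add track 3 in the GUI
```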

Approximate UK postcode boundaries from the Voronoi diagram of ONSPD

This work has been superseded by a new dataset of postcode boundaries I’ve since made – this post is here largely for historical interest as a result, but the data may still be useful because it includes some (very approximate) postcode polygons for Northern Ireland, which I didn’t have data for in the new approach.

TL;DR: you can try entering a postcode here and click through to see the very approximate boundaries of the postcode unit, sector, district and area around there.

This project started from me being curious about some simple questions – what does the boundary look like of all the houses with the same postcode as us? How much of our street does it cover? How much bigger would the boundary of your postcode be if you lived somewhere much more rural?

The problem with answering these questions properly in general (i.e. making it easy for anyone to find out) is that it would be incredibly expensive to do so. There are many underlying reasons for this, but essentially it comes down to the fact that you need the latitude, longitude and postcode of every building in the UK. The only dataset which has this information is Ordnance Survey’s Address-Base product, which has wretched licensing terms. Even if I had £130,000 a year (pricing here) to spend on this, I wouldn’t be able to share the results with people due to the licensing, which is much of the point.

(Although the reasons I was interested in this originally are a bit frivolous, it really is a long-running scandal that this address database isn’t open – Sym wrote about one of the reasons why this matters on the Democracy Club blog and this case study about the Open Addresses project from the ODI gives you lots of good background about the problem, including the huge economic benefits for the country you could expect from this data being made open.)

Anyway, unfortunately, this means that I’ll have to settle for answering these questions imprecisely, using data that is open. A good starting point for this is the ONSPD (the Office for National Statistics Postcode Directory), a CSV file with every postcode in the UK and, for most of them, the latitude and longitude of the centroid of that postcode.

What I wanted to do, essentially, is to find, for each postcode centroid, the boundary of all the points that are closer to that centroid than the centroid of any other postcode. In mathematical terms, we want the Voronoi diagram of the postcode centroids, and we can calculate that with Python’s matplotlib by generating the Delaunay Triangulation of the points with matplotlib.delaunay. (Delaunay triangulation is a closely related geometric concept, from which you can derive the Voronoi diagram.)

That’s not the whole story, however, since we have to think about what happens around the edges of this cloud of postcode points. For example, here is the Voronoi diagram of just the postcode centroids in TD15 (Berwick-upon-Tweed):

The most obvious features there are probably the big spikes out to sea and to the south-east, but they actually aren’t anything to worry about: it’s just that the outermost postcode centroids around the coast at Berwick-upon-Tweed are concave, which produces big circumcircles from the Delaunay triangulation and so large triangles in the Voronoi diagram. Instead, the problem is the postcode centroid I’ve highlighted with a red arrow in the south. This point isn’t actually contained in any of the polygons in the generated Voronoi diagram. I don’t want to risk this happening around the edges of the cloud of postcode points, so before calculating the Voronoi diagram I’ve added 200 points in a very large diameter circle around the true postcode points. These “points at infinity” mean that each point around the edge is definitely contained in a polygon in the Voronoi diagram. For example, if we do the same with the Berwick-upon-Tweed example, you instead get a diagram like this:

I’ve highlighted the same postcode centroid with a red arrow, and you can see that this means it’s now contained in a polygon.
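If you want to reproduce this trick, here’s a sketch using scipy.spatial.Voronoi rather than matplotlib.delaunay (which has since been removed from matplotlib); the random points stand in for the real postcode centroids:

```python
import numpy as np
from scipy.spatial import Voronoi

# Random stand-ins for postcode centroids, in a ~1km square.
rng = np.random.default_rng(42)
centroids = rng.uniform(0, 1000, size=(50, 2))

# Surround the real points with a ring of 200 very distant points so that
# every real point ends up inside a closed (finite) Voronoi cell.
angles = np.linspace(0, 2 * np.pi, 200, endpoint=False)
radius = 1_000_000
ring = np.column_stack([radius * np.cos(angles), radius * np.sin(angles)])

vor = Voronoi(np.vstack([centroids, ring]))

# A cell is finite when its region contains no -1 (scipy's marker for a
# vertex at infinity); only the ring points should have infinite cells.
finite = [
    -1 not in vor.regions[vor.point_region[i]]
    for i in range(len(centroids))
]
print(all(finite))  # every real centroid now sits in a closed polygon
```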

Here’s an example of the polygons you get for the postcodes in SW2 1, which also includes these “points at infinity”:

This does mean that when you run this process for the whole UK, you might end up with massive postcode polygons around the coasts, so the script checks whether any of the polygons’ points might lie outside the boundary of the UK (taken from OpenStreetMap) and, if so, clips that polygon to the boundary. (That boundary is a little way out to sea—you can see it as the purple line in the picture above—but it’s the most readily available good boundary data for the whole country I could find.)

Another inconvenience we have to deal with is that there are multiple postcodes for a single point in some cases. (One of the most famous examples is the DVLA in Swansea, which has 11 postcodes for one building. That pales in comparison to the Royal Mail Birmingham Mail Centre, though, that appears to have 411 postcodes.) We can’t have duplicate points when constructing the Voronoi diagram, so the script preserves a mapping from point to postcodes so we can work out later which polygon corresponds to which postcodes.
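A minimal sketch of that bookkeeping – the postcodes and coordinates here are invented, in the spirit of the DVLA example:

```python
from collections import defaultdict

# Keep a mapping from coordinate to postcodes so duplicate points can be
# collapsed before the Voronoi step, then recovered afterwards.
rows = [
    ("SA99 1AA", (269000, 197000)),
    ("SA99 1AB", (269000, 197000)),  # same building, different postcode
    ("SA99 1BN", (269000, 197000)),
    ("EH8 9YL", (326500, 673200)),
]

postcodes_for_point = defaultdict(list)
for postcode, point in rows:
    postcodes_for_point[point].append(postcode)

# These de-duplicated points are what go into the Voronoi diagram; each
# resulting polygon is then shared by every postcode at that point.
unique_points = list(postcodes_for_point)
print(len(unique_points))  # → 2
```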

One other thing that made this slightly more complicated is that the latitudes and longitudes in ONSPD are for the WGS 84 coordinate system¹ – if you generate the Voronoi diagram from these coordinates directly, you end up with polygons that are elongated, since lines of longitude converge as you go further north and we’re far from the equator. To avoid this, the script transforms the coordinates onto the Ordnance Survey National Grid before calculating the Voronoi diagram, and transforms them back to WGS 84 before generating the KML file. This reduces the distortion a lot, although the grid is of course rather off for the western parts of Northern Ireland.

¹ Correction: Matthew pointed out that ONSPD does have columns with eastings and northings as well as WGS 84 coordinates, so I could have avoided the first transformation.

A last stage of post-processing is to take the shapes around these individual postcode centroids and agglomerate them into higher level groupings of postcodes, in particular:

  • The postcode area, e.g. EH
  • The postcode district, e.g. EH8
  • The postcode sector, e.g. EH8 9
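These groupings can be read straight off the postcode string itself; here’s a sketch (the helper is mine, it assumes a well-formed “OUTWARD INWARD” postcode, and real postcode formats have more edge cases than this handles):

```python
import re

def postcode_groups(postcode):
    """Split e.g. 'EH8 9YL' into its area, district and sector."""
    outward, inward = postcode.split()
    area = re.match(r"[A-Z]+", outward).group(0)  # leading letters, e.g. EH
    district = outward                            # e.g. EH8
    sector = f"{outward} {inward[0]}"             # e.g. EH8 9
    return area, district, sector

print(postcode_groups("EH8 9YL"))  # → ('EH', 'EH8', 'EH8 9')
```

Agglomerating the boundaries is then just a matter of unioning the unit polygons that share an area, district or sector key.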

It’s nice to be able to see these higher level areas too, so this extra step seemed worth doing, e.g. here are the postcode areas for the whole country:

Anyway, my script for generating the original postcode polygons takes a few hours (there are about 2.6 million postcodes in the ONSPD database), which could certainly be sped up a lot, but doesn’t bother me too much since I’d only really need to update it on a new release of ONSPD. And this was just a fun thing to do anyway. (It was meant to be a “quick hack”, but I can’t really call it that given the amount of time it’s taken.)

One small note about that is that I hadn’t really used Python’s multiprocessing library before, but it made it very easy to create a pool of as many worker processes as I have cores on my laptop to speed up the generation. This can be nicely combined with the tqdm package for generating progress meters, as you can see here.

The results are somehow very satisfying to me – seeing the tessellation of the polygons, particularly in areas with widely varying population density is rather beautiful.

Get this data

If you just want to look up a single postcode, you can put your postcode into a MapIt instance I set up with all these boundaries loaded into it.

That MapIt instance also provides a helpful API for getting the data – see the details on the front page.

If you just want an archive with all 1.7 million KML files, you can also download that.

Please let me know in the comments if you make anything fun with these!

Related projects


Of course, I’m not the only person to have done this. I found out after working on this on and off in my spare time for a while that there’s a company called Geolytix who have taken a similar approach to generating postcode sector boundaries, but who used real-world features (e.g. roads, railways) close to the inferred boundaries and adjusted the boundaries to match. There’s more about that in this blog post of theirs:

Postal Sector Boundaries by Geolytix

They’ve released those boundaries (down to the postcode sector level) as open data as a snapshot in 2012, but are charging for updates, as explained there.

The results look rather good! Personally, I’m more interested in the fine grained postcode boundaries (below the postcode sector level), which aren’t included in that output, but it’s nice to see this approach being taken a big step further than I have.

OS Code-Point Polygons

The Ordnance Survey, every civic tech developer’s favourite vampire squid, do sell postcode polygons that are presumably as good as you can get. (It sounds from the Geolytix blog post above as though they’re generated as the Voronoi diagram of Address-Base points, which is what I’d ideally like to do myself – i.e. just run this script on all the addresses in Address-Base.) You can see that here:

Naturally, because it’s the Ordnance Survey, this product is also very expensive and has crappy licensing terms.

What’s next for this project?

I’m not sure if I’ll do any more on this, since it’s been much more of a time sink than I’d hoped already, but if I were to take it further then things I have in mind are:

  • Get a better coastline polygon for the UK. The OSM administrative boundary for the UK was very convenient to use, but because it extends out into the sea some distance, it makes areas around the coast bulge out and that means that you can’t really compare the areas of postcode shapes, which is one of the things I was interested in. You could create a better polygon as a union of some of the shapes in Boundary-Line, for example.
  • Add other sources of postcode points, e.g. OpenAddresses – although that project is in hibernation, I’d hoped to use the addresses they’d already collected, but the bulk downloads don’t seem to include coordinates. I might be missing something obvious, but I haven’t heard back from emailing them about that.
  • Make a Leaflet-based viewer for scrolling around the country and exploring the data. (For viewing multiple areas I’ve been loading them into QGIS, but it would be nice to have a web-based alternative.)

Source code and recreating the data

Source code

You can clone the source for this project from:

If you want to generate this data yourself you can follow the instructions in the README.

When does Columbo first appear in each episode?

I made the graph below for a short talk I gave about the TV series “Columbo”, which I think is a marvellous programme. The point was to depict the typical structure of the show for people who didn’t know it: the substantial blue section of each bar (before Columbo first appears) usually just consists of the murderer going about the crime, so you know exactly how it was committed and by whom. The red section of the bar is the part where you see Columbo trying to figure everything out. (People sometimes say this makes it a “howcatchem”, not a “whodunnit”.)

A graph showing when Columbo first appears in each episode. You can click through to a larger version in which you can see which episode is which.

Anyway, I thought it might be nice to put the graph online, which is most of the point of this post, but thought I might add a few notes about why I like Columbo so much.

Update (2017-10-22): I’ve fixed an error in Columbo’s first appearance time for “Candidate for Crime”, kindly spotted and pointed out by @columbophile!

I think Columbo suffers rather from so many people remembering its clichés (e.g. his basset hound (“Dog”), the cigars, the raincoat, “just one more thing”; his never-seen wife) but not necessarily the things that made it a powerful show.

For me perhaps the most important of those is the way that class, power and wealth are treated on the show. There’s a joke on a Columbo podcast I sometimes listen to that he must work in a special department of the police force where he only deals with crimes of the very wealthy and privileged. And indeed, the privilege of the murderers is frequently emphasized by Columbo; in one scene he literally tots up the value of someone’s house and possessions and works out with them how many years he’d have to work to afford them – and yet this is without bitterness, and he never drops the persona of the enthusiastic and respectful fan.

The house from "Étude in Black"
This is the kind of house a conductor lives in, in the world of Columbo

The awfulness of the murderers in Columbo is compounded by this dynamic; they consistently have an extraordinary sense of entitlement, believing they can get away with anything because of their position in society, and have absolutely no regard for the apparently down-at-heel detective. (This makes the inevitable point where they realise how badly they underestimated him all the sweeter, of course.)

I think this partly resonates with me so much because, while I’m undoubtedly incredibly fortunate in my situation in life, my job is to work on projects that try to increase the transparency of institutions that affect our lives and try to hold people in power to account. This is difficult, frustrating work, and it often seems (particularly in the political climate at the moment) that the successes are overwhelmed by the setbacks.

But the world of Columbo isn’t like that: it’s a fantasy world in which someone who appears powerless succeeds 100% of the time in holding the powerful to account – no matter how rich they are, or how well connected, or even if they’re his own deputy police commissioner. And he does that all while being respectful, polite and enthusiastic.¹

I could write a lot more about what makes the show important to me, why I admire the character of Columbo, and the various flaws I perceive in the show, but I’m trying to get some blog posts actually finished at the moment, so I’ll just leave you with some photos of Peter Falk throughout the 35 years that he played Columbo, and a list of the occupations of the murderers in the series.

¹ If I remember right, Zack Handlen made this point well on the Just One More Thing podcast.

With the help of Wikipedia I made this list of the occupations of the murderers on Columbo. (Note that quite a few of them are “[occupation] to the stars!” where there wouldn’t otherwise necessarily be the required difference in social status :))

  • Psychiatrist
  • Attorney
  • Best-selling Novelist
  • Head of a detective agency
  • Retired Major General in the Army
  • Art critic
  • Independently wealthy (heiress)
  • Chemist / heir to a chemical company
  • Architect
  • Orchestral Conductor
  • (Unclear – retired?)
  • General manager of an American football team
  • Famous stage actors
  • Movie star
  • Cardiac surgeon
  • Chess grandmaster
  • TV chef / banker
  • CEO of a cosmetics company
  • Head of a winery
  • Senatorial candidate
  • (Subliminal) Advertising expert
  • Book publisher
  • Director of a think tank
  • Famous singer
  • Deputy police commissioner
  • Owner of a chain of gyms / fitness expert
  • Photographer
  • Head of a military academy
  • Auto executive
  • President of an electronics company
  • Psychiatrist
  • Movie star
  • Diplomat
  • CIA agent
  • Famous matador
  • Magician and club owner
  • Heir to a boat-building firm (? Fred Draper)
  • TV detective
  • Museum curator (?) in the family museum
  • Accountant (senior partner)
  • Mystery writer
  • Restaurant critic
  • TV executive
  • Mind control guru
  • Arms dealer / poet / raconteur / terrorist
  • (Fake) Psychic
  • Hollywood film director
  • Sex therapist
  • Head of a paramilitary school for mercenaries / retired colonel / head of a think tank
  • Famous artist
  • Magazine publisher
  • Political operative (previously lawyer with the DA)
  • Real estate executive
  • “Dentist to the stars”
  • Gigolo
  • Spoilt frat boys
  • TV host
  • Lawyer
  • Jeweller
  • (No murder, but the kidnapper was an ambulance driver)
  • Gambler
  • Wealthy socialite
  • Radio host
  • Insurance investigator
  • Thoroughbred ranch owner
  • Crime scene investigator
  • “Funeral director to the stars”
  • Film composer and conductor
  • Nightclub promoter

Books (2014 to 2016 ish)

Someone (rather surprisingly!) mentioned they’d enjoyed my last post about books I’d been reading, and would be interested in another one, hence this post.

Lots of these were recommendations from people I know, which I always hugely appreciate, but I haven’t attempted to note who recommended what below, partly since I’m not sure people would be happy with those recommendations being public.

(The links below are affiliate links to Amazon UK, in case that concerns you. Although I don’t know why I bother, really – I think such links have made me about £2 in total over the last year.)

“A Little Life” by Hanya Yanagihara

This is a long, brilliantly written and deeply upsetting novel. I cried more during this book than any I can remember in a long time, and had to stop reading often.

I have some reservations about this book, but on the whole it was an incredible experience to read. You should be warned, though, that it has descriptions of horrifying abuse in it.

“Utopia for Realists” by Rutger Bregman

This is an argument for a series of radical progressive policy ideas, including:

  • Universal basic income
  • Shorter working weeks
  • Open borders

It’s a bit of a polemic; it doesn’t really address some of the issues with universal basic income, for example, such as that some people in society do need more support from the state than others, and how you address that. However, it’s thought-provoking and it’s a good source of references to places where these policies have been tried. (e.g. I didn’t know that Richard Nixon was close to passing something very like a universal basic income in the USA.)

“The Taming of the Queen” by Philippa Gregory

This is one of Philippa Gregory’s novels about the Tudor period, specifically about Kateryn Parr, the last wife of Henry VIII. I had mixed feelings about this; after reading Hilary Mantel, the writing seemed a bit flat, and the exposition delivered in dialogue got repetitive. It’s very tense, though, and a fascinating story that I knew nothing about.

“Command and Control” by Eric Schlosser

I found this book about the safety of nuclear weapons since their earliest development completely gripping, and very alarming. It interleaves the broader history with a detailed account of one particular incident, which worked very well, I thought.

“Hooves Above the Waves” by Laura Clay

(Disclosure: Laura is a friend.) I really enjoyed these dark and absorbing short stories. They also feature some creatures from Scottish mythology which I used to read stories about as a child, but hadn’t thought about for a long time. I’m looking forward to her next works.

“Bad Vibes” by Luke Haines

Subtitled “Britpop and my part in its downfall”, this is a bitter, angry and entertaining account by Luke Haines (of The Auteurs) of the Britpop years, how he disliked almost everyone else in that scene and none of them understood his genius, etc. etc. It works quite well as a companion piece / antidote to “The Last Party: Britpop, Blair and the Demise of English Rock”. I definitely found it more funny than objectionable, probably because I think it’s clear that he knows he’s an arse.

“The Stranger” / “The Outsider” by Albert Camus

I’d never read this before, but I think it’s always intriguing reading a book for the first time that you know the reputation of through popular culture. I can see why it’s so highly regarded, but the idea that lots of young men (apparently?) identify with the protagonist does upset me.

“Life Moves Pretty Fast” by Hadley Freeman

A joyful celebration of 80s movies, and a great source of recommendations for interesting films from that decade that I missed, or are worth rewatching.

I can’t agree about how highly she rates some of the films: for example, I like Ghostbusters, but it’s nowhere close to being my favourite film, as it is for her. Also, I loathe almost everything about “Ferris Bueller’s Day Off”. But that doesn’t matter at all, since she presents interesting, thought-provoking cases for all the films and the writing’s funny and involving throughout.

“Reasons to Stay Alive” by Matt Haig

This is an account of the author’s experience of clinical depression and recovery. (I’ve been treated for depression myself in the past and still struggle with it, but I’ve never experienced anything so severe as him.) There was a lot that I related to there, and it’s a quick, easy and surprisingly uplifting read.

“Ancillary Justice”, “Ancillary Sword” and “Ancillary Mercy” by Ann Leckie

This science-fiction trilogy (“space opera” I guess) has been rightly lauded; the world-building is fantastic and I loved the story. The way it plays with your perceptions of gender and appearance is fascinating as well.

“Into Thin Air” by Jon Krakauer

I read this after watching the film “Everest”. I found it very absorbing; I know a bit about mountaineering in the sense of “climbing Munros”, but the tensions and contradictions of expensive guided climbs in the Himalayas, at altitudes where the human body is effectively dying, were largely new to me.

“The Girl on the Train” by Paula Hawkins

As far as plot goes, I found this a bit predictable, but it was certainly gripping and the unreliable-due-to-alcoholism narrator worked well.

“Good Omens” by Terry Pratchett and Neil Gaiman

One of many re-reads – it’s still such a gem of a book, which has some of my favourite jokes in it.

“Saints of the Shadow Bible” by Ian Rankin

So long as Ian Rankin keeps writing Rebus novels, I’m going to keep reading them. I remember this being up to the standard of the later books (i.e. good :))

“Deep Work” by Cal Newport

This is about how important it is professionally to be able to regularly get into that lovely state of deep concentration so you can work on hard problems, even if it’s for relatively short periods of time each day (e.g. 3 or 4 hours). It’s hard for me to evaluate this book, really, because it plays exactly to my prejudices about what constitutes worthwhile work, which several people have told me are a bit broken :) However, I found it inspiring—it’s OK to stand up for this!—and its practical suggestions for, say, making email less time-consuming, were useful. I’d recommend it.

“Age of Ambition” by Evan Osnos

A very well written book about contemporary China. I learned a lot from it, and it’s very engagingly written. Highly recommended.

“Mansfield Park” and “Northanger Abbey” by Jane Austen

These are the two Jane Austen novels I know least well – I think I read them as a teenager, but couldn’t remember much about either. As far as Mansfield Park goes, I found Fanny Price pretty unsympathetic up to the point where everyone’s putting appalling pressure on her to marry Henry Crawford, at which point I found myself cheering her on. Northanger Abbey I loved too – particularly the awkwardness of trying to get to know people in Bath at the beginning. They’re both brilliant, obviously :)

The Peter Wimsey / Harriet Vane series books by Dorothy Sayers

Lord Peter Wimsey shares with Psmith the disconcerting contradiction in my mind as a reader that we must accept that these are clearly attractive men despite both wearing monocles. Anyway, my introduction to Dorothy Sayers was through The Nine Tailors and Five Red Herrings, but I think my favourite novels so far are those featuring Harriet Vane, and the journey of their relationship improving from the cringingly awful start in Strong Poison. Gaudy Night in particular is brilliant, and I found that novel and Busman’s Honeymoon very moving.

“Willpower: Rediscovering our Greatest Strength” by Roy F. Baumeister and John Tierney

It’s a bit curious writing about this book, because I’d have been very positive about it except for the issue that the research described in this popular science book is at the centre of the reproducibility crisis in psychology. The material about drawing “bright lines” when you’re trying to give something up, and the parts about how low blood sugar can really affect your emotional state and ability to make decisions rang very true to me.

“The KLF: Chaos, Magic and the Band who Burned a Million Pounds” by John Higgs

A brilliant account of the story of the KLF (so far) but it’s really about far more than just what they’ve done – it goes into the art movements that inspired or (may) relate to the band, and different ways of interpreting their bizarre story. It’s also very funny – I laughed out loud a lot when reading this.

“Stranger Than We Can Imagine” by John Higgs

By the same author as the previous book, this is an audacious journey through the 20th century, looking at how art and culture changed. I enjoyed it, and admire both the attempt and the writing – it’s great fun to read.

Terry Pratchett – Discworld novels

I’ve lost track of which Discworld novels I’ve re-read over this period, but it certainly included the Witches series. Anyway, they’re always so pleasurable to read – I’m really glad I started reading them again after such a long gap.

“Master and Commander” by Patrick O’Brian

I know lots of people who are big fans of Patrick O’Brian’s Aubrey / Maturin novels, but I’d never read any before this. I found it took a little while to get used to the idea that it was fine to not understand all the nautical terms, and just figure out roughly what’s going on from context – this is very similar to lots of science fiction, in fact. (I should say that I know some people get just as much enjoyment out of understanding every detail of these books as well.) Anyway, I’m planning to read more of the series, but got a bit stuck, for reasons I’m not sure of, part way through “Post Captain”. I’m sure I’ll go back to it, though.

“Danny the Champion of the World” by Roald Dahl

I loved this as a child, and it’s still marvellous. Perhaps the biggest shift in getting older is that I thought the father was wonderful when reading it as a child, and as a 40-year-old my thoughts were that he was, despite having many great qualities, unbelievably irresponsible.

“Nation” by Terry Pratchett

A marvellous standalone Terry Pratchett novel, which is funny and delightful. I think probably the less said about it the better, because right from the start there are things that are surprising, so arguably might constitute spoilers :)

“Anno Dracula” by Kim Newman

I don’t normally read horror fiction, so this was a bit of a departure for me. It’s by Kim Newman (whom you might know better from his film criticism) and set in a Victorian London some time after a key event in Dracula went differently, and now about a third of the population are vampires and Dracula is the Prince Consort. It was certainly disturbing, but very imaginative and packed with fictional characters from the period which reminded me of Alan Moore’s excellent League of Extraordinary Gentlemen comics. It’s really good.

“Dracula” by Bram Stoker

I think I re-read this after Anno Dracula. It’s still excellent, of course, but for some reason the thing that jumped out at me this time is that Van Helsing’s treatment (random blood transfusions from one person to another) was very likely to harm the recipient, given that this was before blood types were known about.

“The Knot Shop Man”: “Air”, “Fire”, “Earth” and “Water”

(Disclaimer: the author is a friend and colleague.) An amazing set of four books that you can read in any order, despite having a single narrative thread running through them all. These are in the finest tradition of dark and fantastic stories for children (sorry: “smart children or thoughtful adults”) with a huge amount to enjoy. (If you’re interested in knots, then you’ll like it even more.)

“The Fourth Revolution” by John Micklethwait and Adrian Wooldridge

I didn’t quite know what to make of this; the history of what they consider the previous revolutions in types of government was interesting to me, as was the cataloguing of the changes the world is going through. The conclusions, though, seemed not only like just what you would expect from editors of The Economist, but also pretty unimaginative.

“Creativity, Inc” by Ed Catmull

I wish this were more about the history of Pixar, and tried less hard to be a management book. (The management stuff is pretty good, and has quite a few things that were certainly readily applicable to where I work, but it’s not really why I was reading it.) It is a good insight into Pixar as a company, though, if you’re a fan of them (as I am!) both on a cinematic and technical level. There’s quite a bit about the troubled development history of some of their most successful films (e.g. Toy Story 2) and how they decide what to do with projects that are seen as failing.

It describes “Inside Out” as a work in progress, and on reading that I thought, “how’s that ever going to work?” How wrong I was :)

“Jeeves and the Wedding Bells” by Sebastian Faulks

I was a bit sceptical about this – a Jeeves and Wooster novel written by someone other than Wodehouse! – but it’s really joyous, particularly in ways that I can’t talk about for fear of spoiling it for you :)

“Delusions of Gender” by Cordelia Fine

A very good book debunking the nonsense talked about the difference between men’s and women’s brains.

“The Paying Guests” by Sarah Waters

I think Sarah Waters is just a brilliant writer – I’ve read all of her novels, and this latest one is great too.

“Geek Sublime” by Vikram Chandra

Unfortunately I didn’t get on with this at all. I care a lot about the aesthetics of programming, obviously, but didn’t really relate to what the author was saying.

“The Way Inn” by Will Wiles

An excellent second novel from Will Wiles (disclaimer: whom I know a bit from college). It’s funny, painful, and not quite what you might expect from early on in the book. (It’s added a certain something to staying in chain hotels for me, which you’ll understand if you read it.)

“The Blunders of Our Governments” by Anthony King and Ivor Crewe

This book is in two halves – the chapters in the first half each cover a blunder that a UK government has made in recent history (e.g. the Poll Tax, Individual Learning Accounts, etc.) and the second half has more general analysis and suggestions. I think the first half is brilliant, and very professionally relevant to me and a huge proportion of the people I know who work in civic tech or GDS, say. The second half is less convincing, particularly when it touches on IT projects. But the case studies in the first half are compelling, partly because you will frequently cringe on hearing some of the mistakes that well-meaning people have made.

“Dodger” by Terry Pratchett

More Terry Pratchett, in Dickensian mode (Dickens even appears as a character). Very enjoyable, as you’d expect, and I found the mystery / thriller aspect exciting.

“Coraline” by Neil Gaiman

I read this partly out of embarrassment that I hadn’t before, but also because a friend told me they didn’t enjoy “The Ocean at the End of the Lane” because it was so similar to Coraline. I disagree, I think – both stand on their own merits. They’re both brilliant stories – I wish they’d been around for me to read when I was much younger. (I watched the animated film of Coraline later, which is a very nice adaptation.)

“23 Things They Don’t Tell You About Capitalism” by Ha-Joon Chang

This title rather sets up the obvious line that it included quite a lot of things that I have been told about capitalism :) It’s a quick read with some striking examples, though, and tackles lots of broken assumptions people make about economics.

“A Slip of the Keyboard” by Terry Pratchett

This is a collection of Terry Pratchett’s non-fiction writing, which has some real gems in it – e.g. his anger when writing about the danger of extinction of orangutans is very powerful, and (very different in tone) an essay about how our Father Christmas has clearly been swapped with one from another universe is brilliant. I read this shortly after Terry died, so a lot of this felt very poignant.

“The Life-Changing Magic of Tidying” / “Spark Joy” by Marie Kondo

These books, which are ostensibly about tidying up but are in large part about getting rid of your possessions, have a huge following. I’m a bit conflicted about them, because they are bonkers, but they’ve been genuinely useful to me in prompting me to get on with (a) recycling or selling on things that I don’t really need, based on her test of “does it spark joy when you hold it?” and (b) her techniques for folding and storing clothes.

(When I say they’re “bonkers”, I mean, for example, the suggestion that you give little pep talks to screwdrivers and other unglamorous but necessary possessions; thinking about how your socks feel being bundled into a drawer; the assertion that clothes worn closer to your heart are easier to feel affinity with, etc. etc. Still, it’s interesting to me that despite these things, I think these books have still had a very positive influence on my life.)

“Hatchet Job” by Mark Kermode

As a dedicated follower of the church of Wittertainment, I was predisposed to like this book by Mark Kermode about the practice of film criticism and its place in the world, and it didn’t disappoint me.

Christmas Gift Ideas

Holiday gift guides seem to be almost exclusively terrible, particularly those aimed at “geeks”, whatever people think that means. In particular these lists are often split by gender, presumably because they’re written by (or pandering to) idiots. And they’re typically full of novelties or cute ideas that in practice will just occupy valuable space and you’d feel guilty about giving away or recycling. This post is my attempt to write a list of ideas which I think has (mostly) genuinely useful presents on it, based on things we’ve owned and used for a while.

I’m quite conflicted about this exercise, I should probably say. Different families and social groups have very different present-giving cultures, but for many people in a similarly lucky position to me, getting consumer items that you don’t really want, or a subtly wrong variant of something you do want, is worse than getting nothing at all, and much worse than the person giving money to a charity instead. That said, maybe these lists are useful as a basis for things people might suggest and discuss before giving as presents?

(There are quite a few Amazon affiliate links here, which I haven’t tried using in a blog post before. I imagine no one much will read or click on links in this post anyway, but if that bothers you, you’ve been warned, at least.)

Muji touchscreen gloves

You can get gloves with conductive material in the fingertips from loads of places nowadays. The idea is that you can use capacitive touch-screens, like those on your phone or tablet, without removing your gloves. These Muji ones are pretty cheap, and work OK – I find you need a bit more pressure than without the gloves to get them to register, but it’s fine.

Non-contact infrared digital thermometer

These thermometers are brilliant for accurate and remote measurement of temperature. (This was a great recommendation from my colleague Paul.) I use mine quite a lot for things like cooking and checking the oven temperature, as well as measuring the surface temperature of my feet, how cold the walls are, etc. It has a laser pointer as well, to mark what you’re measuring the temperature of. I think this is very similar to the device used by Gale Boetticher in Breaking Bad when he’s making tea, if that means anything to you :)

Ear plugs for loud concerts and public transport

I’m sure that my hearing was somewhat damaged from gigs and nightclubs when I was younger; it always seemed to be particularly bad after going to small venues where the treble is far too powerful. (I wonder if this is because the sound engineers have damaged the higher ranges of their hearing over the course of their working lives and then compensate for that, damaging the customers’ hearing in the same way, and so on…)

To avoid further damage to my hearing nowadays, I always take ear plugs with a fairly flat frequency response along to gigs. They can never be ideal: you often get a huge amount of bass sound through bone conduction at loud gigs, and the ear plugs can’t do anything about that. It’s probably better to have rather too bassy sound than damage your hearing, though.

The ones I’ve linked to from the heading have switchable filters for different levels of sound attenuation – the ones I have don’t have that feature, but in retrospect it would have been nice if they did.

I carry these in my bag all the time, and they’re also frequently useful for blocking out sound on public transport, to give yourself some peace and quiet.

Raspberry Pi 3 Model B

I think it’s always possible to think of a fun new project that a Raspberry Pi would be useful for, and the Pi 3 is a big step forward from the previous models, having Wi-Fi and Bluetooth built-in.

I really like the lovely Pimoroni cases, like these, which I got for our Pi 3, and they do other nice accessories.

A good battery charger

NiMH rechargeable batteries are really good nowadays, and save you quite a bit of money if you have lots of devices that use AA / AAA batteries. We got this battery charger in part because I believe it is the same model, albeit with different branding, as the Wirecutter’s “Also Great” pick – it has more features than their basic suggestion. (There’s a useful FAQ for it in an Amazon review of the US version.) Although the UI isn’t very intuitive, you can use the device to measure the capacity of your existing batteries, which is really helpful – when we first got it I went through all of our existing rechargeable batteries to work out which were worth keeping and which should be replaced.

“The Knot Shop Man” books

My friend and colleague Dave Whiteland wrote an amazing series of books called “The Knot Shop Man”, which are described as being for “smart children or thoughtful adults”. The theme of knots runs through all the books (which you can read in any order) and they come bound up in a very special knot. (After finishing reading them, you should try to retie it. :)) Not enough people know about these books, and I think the ones he’s selling at the moment are limited in quantity.

A Network Attached Storage device

If you don’t have any Network Attached Storage, I think you’ll be surprised at how soon you come to rely on it, both for backups and for storing music and videos on to stream to your phone, TV, or whatever.

For me, one of the Synology 2 drive bay boxes seemed about right, and it’s been brilliant so far. (Another of my excellent colleagues, Zarino, wrote a blog post about the initial setup of these.)

An AeroPress coffee maker

This is our favourite way of making coffee, for one or possibly two people.

A cut-resistant glove

If you have one of those excellent Microplane graters or a mandoline and are as clumsy as me, you’re probably familiar with the experience of accidentally slicing your hands when using them. You can partially solve this problem with a cut-resistant glove. (I say “partially” because this then shifts the problem to “remembering to use the cut-resistant glove”.)

A Mu folding-plug charger

These are a lovely design: they make a UK 3-pin plug take up less space by letting you fold it flat, and provide a 2.4 A USB port.

Butchery course at the Ginger Pig

Presents that are experiences rather than things-that-take-up-physical-space often work out well. One interesting example we tried was a butchery course at the Ginger Pig – you can do a class on pork, beef or lamb, and you get both a big meal and a lot of meat to take away, as well as learning what each cut of meat is good for.

Bluetooth keyboard

I really try to avoid using Bluetooth, because, well, it’s terrible, and gives me flashbacks to the worst job I ever had, working on “personal area networking”. But this keyboard has actually been pretty good – you can have it paired to three different devices and swap between them easily. (It doesn’t seem to be easy to find a keyboard that can be paired with more than three devices, but maybe someone knows of one.)

Bose noise-cancelling headphones

(I have an earlier version of these, but the QC25 seems to be the current equivalent.) I gather that headphone connoisseurs don’t particularly rate the sound quality of these, but basically everyone agrees that the noise cancelling is amazing. (The sound seems great to me, for what it’s worth, but I’m not an audiophile.) For long coach, train and plane trips they’re fabulous, if quite bulky to take with you. They are expensive, though.

Sliding webcam covers

This is a set of 5 little sliding webcam covers. The idea is that if someone malicious gains remote access to your computer, then the impact is much worse if they can also see everything from your webcam than if they can’t. These covers are really cheap, and mean that you can just slide the window open when you actually need the webcam.

A bumper case and screen protector for your phone

I think I probably drop my phone about once a day, but I haven’t smashed its screen yet, thanks to having a good bumper case and screen protector. The Ringke Fusion line of bumper cases (which they seem to do for most current phones) are the ones I’ve used for a while, and as a very clumsy person I can testify that they protect your smartphone very well.

As a screen protector I’m currently using one of the “Invisible Defender Glass” models.

(occasional miscellanea)