Tuesday, 12 May 2015

The 2015 General Election: London Results

The perennially excellent London DataStore has published comprehensive, accessible and usable data on the 2015 General Election. So, naturally, I had to make some maps of it. There probably isn't anything about this election that hasn't been mapped but since my last blog was a General Election piece I thought I'd do a little follow-up, with not a hexagon in sight. There have been quite a few 'who came second' maps but not many which include third and fourth places. I'm particularly interested in London because it's something of an exception in the South East of England and, well, I just wanted to make some maps. Below you'll see maps for who came first, second, third and fourth. The last map has the constituency names. I resisted the temptation to do a 'who came eleventh' map, but, since you asked, there were only four constituencies where there were at least 11 candidates, and the parties included the Whig Party. They might have been quite prominent had these maps been made in 1830 (incidentally, there were 658 constituencies in the UK then, which only had 24 million people).

'That's Blockbusters' - for Labour

UKIP emerge and Tories dominate second place

UKIP by far the most in third place - Greens emerge

No place for three 'main' parties here

Just in case you don't know all constituencies off by heart

You should be able to see pretty big versions of the maps if you click on them and then open them in a new tab/window. I've dispensed with the usual boring map legend and instead turned it into a 'bargend' (a portmanteau I just invented). I hope you find these interesting. 

Final nugget: the Whig Party came 9th in Bethnal Green and Bow (their best result) with 203 votes for my namesake, Alasdair Henderson.

Thursday, 7 May 2015

Can Google search data predict an election victory?

Today seems like a good day to write a little blog post on what search data can and cannot tell us. Why? Because of the story below, which has been on the front page of the MailOnline for much of the day. This is just the way news works, but I thought it would be useful to give a bit more information here. The story behind the story goes something like this...

Simon Rogers, datajournalist and Data Editor at Google in San Francisco, got in touch some time in April to ask if I could help him map party leader search patterns by constituency. I'd been doing a lot of work with search data for housing markets anyway so this seemed like an interesting idea. We took search data for all points that were geocoded (there were about 5,000 total across the UK) and then produced a constituency version for all 650 seats. The final constituency results matched very closely the proportions for the individual places. The data are for the previous 12 months. 

MailOnline front page 07/05/15

The big question is what this all means. Do I think, as the MailOnline suggest, that 'Google Search tips Cameron to win election'? No. Do I think it disproves it? No. Do I think the large volume of search for Nigel Farage indicates his level of popularity across the country? Again, no. However, it could indicate that people are more likely to show interest in UKIP in an environment when nobody else is watching or listening. But we don't know. Does this prove that Miliband will come third? Definitely not. The map merely indicates who was the most searched for party leader in each constituency. The intent and sentiments of individual users are not known. In my own research in housing market analysis I've tackled this by doing interviews with website users but since this data is from Google they could of course add in other terms which people might combine with party leader names (some more favourable than others!). 
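For anyone curious about what 'most searched for party leader in each constituency' boils down to, here's a minimal sketch in Python. It is not the actual Google pipeline: the record layout (constituency, leader, count), the function name and the sample numbers are all made up for illustration.

```python
from collections import Counter, defaultdict


def most_searched_by_constituency(records):
    """Return the leader with the highest search total in each constituency.

    records: iterable of (constituency, leader, search_count) tuples.
    This layout is an assumption for illustration, not the real data format.
    """
    totals = defaultdict(Counter)
    for constituency, leader, count in records:
        totals[constituency][leader] += count
    # most_common(1) gives a one-item list of (leader, total) pairs
    return {c: counts.most_common(1)[0][0] for c, counts in totals.items()}


# Made-up numbers, purely to show the mechanics
sample = [
    ("Witney", "David Cameron", 120),
    ("Witney", "Nigel Farage", 90),
    ("Witney", "David Cameron", 30),
    ("Chesterfield", "Nicola Sturgeon", 80),
    ("Chesterfield", "Ed Miliband", 75),
]
winners = most_searched_by_constituency(sample)
```

Note that a 'winner' here only reflects search volume. As above, it says nothing about the intent or sentiment behind each search.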

Kate Newton from Bing also got in touch to say they had worked on something similar (though much more sophisticated) in relation to the Scottish Independence referendum last year. More widely, there is a body of emerging research (including my own) which looks at search patterns and subsequent activity - mostly in the field of economics. The results suggest that search can be analysed meaningfully to predict future activity. But that's not what the party leader piece was about - not from my perspective. Thankfully, other media outlets were more measured in their analysis - such as BuzzFeed UK, The Scotsman and The Telegraph.

MailOnline story 07/05/15

What's most interesting to me? Well, I'm most interested to see how the search patterns relate to outcomes in key marginals. I suspect there will not be much of a pattern but if there is it will be interesting to attempt to take this little piece of work further - perhaps for the US 2016 election. Other than that, this is an interesting stocking filler on a day when the papers and TV crews are forbidden from reporting anything really substantial until the polls close at 10pm. In the meantime, my favourite snippets from the search map...

David Cameron is the most searched for leader in his own constituency, but he's surrounded by a sea of purple, plus a blob of red and orange.

Witney - David Cameron's constituency

Perhaps a little predictably, there is a lot of search for UKIP in Kent but - strangely - it appears that in the constituency where Nigel Farage is standing (Thanet South) the party leader most searched for is Ed Miliband.

UKIP - lots of search in Kent, but not so much in Thanet South

The search data throw up some interesting results. The image below is a good example. Natalie Bennett (Green Party) is the most searched for leader in Durham North West and next door in Durham North the leader (by search over the past 12 months) is Leanne Wood (Plaid Cymru). I suspect this was down to a localised spike in interest after the leaders' debates.

Durham - Green and Welsh Nationalist stronghold?

Other interesting nuggets to emerge were the way in which geographical patterns sometimes reflected the opposite of what you'd expect. The most obvious example was where Nicola Sturgeon (SNP and not standing in this election) was the most searched for party leader in several English constituencies, such as Chesterfield (below). Her excellent performance in the leaders' debates probably led to a spike in interest. Perhaps the SNP ought to consider putting up candidates in England too.

The SNP take Chesterfield? Not so fast.

A kind of similar situation to the SNP/Chesterfield example can be seen in the final image below, where Nigel Farage is the most searched for leader in Aberdeen North. This Scottish constituency has no UKIP candidate and, even if it did, they would be a long way away from the top party.

What's my prediction for the outcome of the election? The only prediction I'll make is that the results will look nothing like this map!

Sunday, 1 March 2015

Static maps from CartoDB

I've used CartoDB quite a bit recently to create interactive maps. My Historic Buildings of Scotland map and my England Grade I Listed Buildings map have both proved popular. Another map that received lots of traffic was my English Greenbelt map. This one in particular was popular I think because it allowed people to find out exactly where the green belt was near them - or in fact to find out which areas aren't greenbelt. Although interactive maps are great for some things, sometimes we just want to put a static image in a report or on a website and this can often be a bit tricky with interactive content. Luckily, CartoDB allows users to export static image files (png format), and you can customise the dimensions as well. This CartoDB blog post shows you how. The tool opens the static map in a new browser tab, so if nothing happens check your pop-up blocker settings and allow pop-ups for the site.

View the full size version

As you can see above, I've exported my English Greenbelt map. I often get asked if I have a large static image of it because if you search online there aren't really any large up to date, decent resolution, detailed green belt images. I also added some attribution information (important!) and a basic title. This is just a quick example so isn't perfect in relation to labels and so on but it gives a simple overview of the green belt.

Sunday, 25 January 2015

How many people live in Tokyo?

Back in August 2014 when I was preparing some material for teaching students how to query data in a GIS, I devised a very simple example where they had to select all the countries in the world with a lower population than the Tokyo Metropolitan Area (or Greater Tokyo Area, as it's sometimes called). I did this just as an example but since I found the results quite interesting, I quickly turned it into a map and posted it on Twitter, complete with 'Toyko' typo in the subheading. It was really just a quick example of how to query data in a GIS but it also highlighted the massive population of the Tokyo metropolitan area. You can find the full size, original image here. I was prompted to write this blog because the map was reposted on the Canadian Twitter feed @AsapSCIENCE a few days ago and since then my inbox has been a bit busy.

Now, back to the original question of how many people live in 'Tokyo'. Well, when I say 'Tokyo' in the map, I'm referring to the wider metropolitan area, which in 2014 the United Nations World Urbanization Prospects said has a total population of 37,833,000, far beyond the next largest urban agglomeration, Delhi, at 24,953,000. By way of comparison, using the urban agglomeration population (rather than a city's administrative boundaries), Toronto had 5.9 million, London had 10.2 million and Beijing had 19.5 million. Clearly, these definitions include other urban places that are not Tokyo (e.g. Yokohama) but they are recognized as being part of a fairly coherent metropolitan area. 

So, in the map, I'm using a figure of 'approx 36 million' as I say but in reality the UN figure is a bit higher. If we want to narrow the definition down to just the inner urban area then obviously the figure reduces significantly. I'm not usually one to cite Wikipedia, but in this case it's a good place to go to learn about the various definitions of the Greater Tokyo Area/Kantō region. If you don't want to click, here's a summary of some Tokyo populations...
  • Former city area (23 'special wards') - 8.95 million people
  • Tokyo Metropolis - 13.05 million
  • Tokyo Metropolitan Employment Area - 31.70 million
  • National Capital Region - 43.47 million
If we take the metropolitan definition used by the United Nations, then Tokyo does indeed have more people than Canada, at 35.5 million (2014 Statistics Canada estimate) and far more than Australia at 23.7 million (see their population clock). The Tokyo metropolitan population is roughly the same as the whole of California, which currently stands at about 38 million.
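The query itself is about as simple as GIS queries get: keep every country whose population is below a threshold. Here's a minimal sketch in Python using only the figures quoted in this post. The function name is my own, and a real run would read the population column from the Natural Earth attribute table rather than a hand-typed dictionary.

```python
# UN World Urbanization Prospects 2014 figure for the Tokyo agglomeration (quoted above)
TOKYO_METRO_POP = 37_833_000

# Populations quoted in this post; stand-ins for a full country attribute table
countries = {
    "Canada": 35_500_000,
    "Australia": 23_700_000,
}


def smaller_than(threshold, populations):
    """Return the names of places with a population below the threshold, sorted A-Z."""
    return sorted(name for name, pop in populations.items() if pop < threshold)


smaller = smaller_than(TOKYO_METRO_POP, countries)
```

In QGIS the equivalent is a Select by Expression on the layer's population field. I believe that field is called POP_EST in the Natural Earth countries layer, but treat the name as an assumption and check your own attribute table.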

Link to chart

This was all just a little bit of map trivia and whilst it seems to have annoyed some people who live in a red country, the point was just to demonstrate the simple analytical power of GIS in addition to the size of Tokyo (to make it more interesting). The data I used are from Natural Earth if you want to have a look yourself and the software I used is a free GIS called QGIS, which is really good. Some other random facts about what happened to my original tweet...

Tokyo metropolitan area

Saturday, 17 January 2015

Interpreting political maps

I recently tweeted a couple of maps showing the 100 most and least deprived constituencies in England. I used the 2010 English Indices of Deprivation to calculate this, aggregating the data from smaller areas to parliamentary constituencies. The method is not perfect but on the whole the areas identified are either among the poorest or richest in England. There are 533 constituencies in England so the figure of 100 is roughly the 20% most and least deprived (18.76% to be more precise). I shaded the maps using red for Labour, blue for Conservative, yellow for Liberal Democrats and so on. The most obvious thing about the maps is, of course, the fact that the most deprived map shows nearly all Labour constituencies and the least deprived shows almost all Conservative constituencies. Click the caption below the images to see interactive versions.
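For the curious, here's a rough sketch in Python of one way to do this kind of aggregation. It rests on assumptions: I've used a population-weighted average of small-area scores (only one of several defensible choices), I've assumed that a higher score means more deprived, and the area names and numbers are invented.

```python
from collections import defaultdict


def constituency_scores(small_areas):
    """Population-weighted mean deprivation score per constituency.

    small_areas: iterable of (constituency, population, deprivation_score).
    The weighting scheme is an assumption made for this sketch.
    """
    weighted = defaultdict(float)
    population = defaultdict(float)
    for constituency, pop, score in small_areas:
        weighted[constituency] += pop * score
        population[constituency] += pop
    return {c: weighted[c] / population[c] for c in weighted}


def most_deprived(scores, n):
    """Top n constituencies, assuming a higher score means more deprived."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


# Invented small areas in three invented constituencies
sample = [
    ("A", 1000, 40.0), ("A", 3000, 20.0),
    ("B", 2000, 10.0),
    ("C", 1500, 30.0), ("C", 500, 50.0),
]
scores = constituency_scores(sample)
top_two = most_deprived(scores, 2)

# The share of England's 533 constituencies covered by a 'top 100' list
share = 100 / 533 * 100
```

The last line is just the arithmetic behind the 18.76% figure quoted above.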

100 most deprived constituencies

100 least deprived constituencies

These kinds of maps often make a big impact and are shared widely but people tend to draw conclusions from the patterns they see that are not necessarily correct - and often conclusions which mirror pre-existing biases and perspectives. For example, some people see these maps and claim that voting Labour makes you poor or that only rich people vote Conservative. Some would even claim that this proves that Labour has failed the constituencies they serve. Opponents would argue that coalition cuts have merely deepened spatial inequalities and hit Labour-voting areas hardest. This is all a bit dramatic, but you don't have to search online long to find such views.

Other people might say that if you want to be richer you should vote Conservative. Other people would tell you not to be so simplistic and point to the way in which voting patterns are formed at the local level. Still others might point to the longstanding economic differences between north and south in England and say that this has something to do with it. Perhaps others will say that the Conservatives are the party of the rich and that Labour are the party for the poor. There are varying degrees of truth in all these views but the point I want to make here is that none of this can be proven just by looking at a political map.

For me, such maps are a starting point for a conversation about what these patterns might mean, whether they are a problem and what might be done about it, if anything. I'm not making these maps because I'm pro-Labour or pro-Conservative or because I think that they prove anything in particular but because I want to draw attention to the patterns and what they might mean. Finally, some observations from the maps...

  • There are no Labour constituencies amongst the 100 least deprived in England.
  • There are 2 Conservative constituencies amongst the 100 most deprived in England.
  • Sheffield Hallam (Liberal Democrat, Nick Clegg) is amongst the 100 least deprived constituencies in England. 
  • There are 5 Liberal Democrat constituencies amongst the 100 most deprived in England.
  • There are 7 Liberal Democrat constituencies amongst the 100 least deprived in England.

It will be very interesting to see how these patterns change (if at all) after the General Election this year.

Wednesday, 14 January 2015

Visualising Residential Mobility in Urban England

Last year I produced a few commuting maps of England and Wales after the 2011 Census data were released. Now I've turned my attention to mapping patterns of residential mobility in urban areas of England as part of my work on understanding housing markets. This post highlights some of the patterns uncovered in the data - which are output area level migration flows for England and Wales (about 4 million individual flows). If you're interested in how I did this you can find out in a previous post. The first image is of the urban North West of England and for subsequent images I've zoomed to different parts of the country. I've kept it simple and only shown the flow lines, apart from in the North West where I've also added some place labels. It's all a bit experimental at this stage.

You can find a higher resolution image here

The North East of England

West, East and South Yorkshire

I think some places are missing (working on it)

I've adjusted the brightness a little to make this clearer

What does all this show? It shows what many people may already know or expect but basically it illustrates the extent of residential mobility patterns in some of England's major urban areas - plus a bit more in the South West example above. There's a lot more that could be said about this but for now I'll leave it at that. I'm sorry if your town or city isn't on the map! Maybe next time...

Notes: I've filtered the data so in certain cases some places are not shown (e.g. in the North West image places in North Wales are not visible). Also, I've only shown flows of a certain volume in order to filter out the noise. 
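The volume filter is nothing fancy. Here's a minimal sketch in Python of the idea, with a hypothetical record layout (origin, destination, number of movers) and an arbitrary threshold. The cut-off I actually used isn't stated here, so don't read anything into the number.

```python
def filter_flows(flows, min_volume):
    """Keep only flows with at least min_volume movers.

    flows: iterable of (origin, destination, volume) tuples (layout assumed).
    """
    return [flow for flow in flows if flow[2] >= min_volume]


# Invented origin/destination codes and counts, for illustration only
sample_flows = [
    ("area_1", "area_2", 1),
    ("area_1", "area_3", 12),
    ("area_2", "area_3", 5),
]
kept = filter_flows(sample_flows, min_volume=5)
```

With around 4 million individual flows, even a low threshold removes a lot of single-mover lines and makes the main corridors much easier to see.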

Tuesday, 23 December 2014

Simple Animations with QGIS - A Long Tutorial

This rather long post explains how you can make images and animations like the ones below using only open source software (mostly QGIS) and open data. I've just used flights from Hartsfield-Jackson International Airport in Atlanta because it's the busiest airport in the world and serves a wide variety of destinations. The methods are relatively straightforward but it does take time to get your head around if you're new to the software and the data. If you're already a competent QGIS user it should be pretty easy. Once you've got the method nailed down you can apply it to all kinds of different scenarios and datasets. First of all, though, here's a static image of outbound flights from ATL projected onto a globe.

Flight destinations from Atlanta (full size)

The first thing you'll need to do is get some data. In this example I've taken some general country boundaries from Natural Earth and for the flights data I've used the OpenFlights dataset. I then created an azimuthal orthographic projection centred on Hartsfield-Jackson International Airport in Atlanta. Hamish Campbell already wrote an excellent tutorial on how to do this in QGIS, so just follow that if you want to use a projection that makes your country layer appear as if it were on a globe. The only extra tips you'll perhaps need to replicate Hamish's method are as follows. To get the lat/long of a place you want to centre your projection on, just search for it on Google Maps and then look in the address bar for the coordinates. The Python script on Hamish's page just needs to be copied and pasted into a text document, saved with a .py extension and then placed in the correct folder on your computer (on a PC with QGIS 2.4 it would be something like this: C:\Program Files\QGIS Chugiak\bin). Also, when you clip the layer to a global projection, QGIS will create a clip circle and a new clipped layer. These may not appear at all, or if they do they may look very blocky. If so, go to the properties for the layer in QGIS and on the Rendering tab just untick 'Simplify Geometries'. I normally save new copies of these layers using Save As... from each layer.
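If you haven't built a custom projection before, it's really just a short text string. Here's a small Python sketch that assembles an orthographic proj4-style definition centred on a given lat/long. The helper name is mine, the ATL coordinates are approximate, and the sphere radius is a common convention rather than anything taken from Hamish's tutorial, so follow his page for the definitive version.

```python
def ortho_proj4(lat, lon):
    """Build a proj4-style orthographic ('globe view') definition centred on lat/lon.

    The sphere radius (6370997 m) is a conventional value, assumed for this sketch.
    """
    return (
        f"+proj=ortho +lat_0={lat} +lon_0={lon} "
        "+x_0=0 +y_0=0 +a=6370997 +b=6370997 +units=m +no_defs"
    )


# Approximate coordinates for Hartsfield-Jackson (ATL), e.g. read off a web map
ATL_LAT, ATL_LON = 33.64, -84.43
atl_crs = ortho_proj4(ATL_LAT, ATL_LON)
```

You can paste the resulting string into the custom CRS dialogue in QGIS. Changing the two numbers re-centres the globe on any other place.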

A faster animated version of ATL outbound flights (big)

That should be your global projection sorted. You can then apply it via the Project menu in QGIS and then Project Properties... CRS and then 'Project on the fly' as in the tutorial.  To achieve some of the visual effects above, I just duplicated the clipped circle layer (you'll have to Save As... from the temp clipped layer to do this) and applied an inverted polygon style and a shapeburst fill with a blue to black gradient (Nyall Dawson did a great blog post on this, which you might find useful). I also did a similar thing with the land layer, just to make some of the smaller islands stand out. You'll also need to make the outline colour the same as the fill colour in the circle to avoid a line appearing through your earth.

A very slow version, with a pause at the end (big)

So far, so good. But what about the flight paths and animated dots? Well, to create the lines you can follow my blog post on flow mapping in QGIS and use the sample dataset I posted there. You'll need to calculate two new columns for this shapefile (see below) and use the MMQGIS plugin for QGIS (installed, as ever, via the Plugins menu). If you just add this file to your global azimuthal orthographic projection there will be so many lines and it may take a long time to display so there are a few intermediate steps I'd recommend... 

1. Open the new global flights dataset in a blank QGIS project using the default projection and then remove duplicate lines using Modify, Delete Duplicate Geometries in the MMQGIS plugin. Many routes (e.g. JFK-LHR) are served by multiple airlines and I wanted to only show origins and destinations. This also makes the file much smaller.
2. Although the flight connections would appear as straight lines on our global projection, I like them to look a little curved; partly for effect and partly to bring out the curvature of the earth but also because flight paths are not straight lines in reality. So, once I've removed duplicates I then 'densify' the lines in QGIS by adding in 50 intermediate vertices - done via Vector, Geometry Tools, Densify polygons in QGIS.
3. I then added this new flights layer to my ATL-based global projection and clipped the layer using the Hamish Campbell method, saving the resulting layer as a new shapefile. You should now have a globe centred on the location of your choice, plus some nice curved airline flight paths.
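If you're wondering what 'densify' actually does in step 2, here's a minimal Python sketch of the idea (my own illustration, not the QGIS implementation). It adds evenly spaced vertices along a two-point line in lon/lat space. Each vertex is then reprojected separately, which is why the line bends on the globe instead of staying a single straight segment.

```python
def densify(start, end, n_vertices):
    """Return the line from start to end with n_vertices evenly spaced points added.

    start, end: (x, y) tuples, e.g. (lon, lat). The result includes both end points,
    so it has n_vertices + 2 points in total.
    """
    (x0, y0), (x1, y1) = start, end
    steps = n_vertices + 1  # number of segments after densifying
    return [
        (x0 + (x1 - x0) * i / steps, y0 + (y1 - y0) * i / steps)
        for i in range(steps + 1)
    ]


# A simple made-up line with four extra vertices
line = densify((0, 0), (10, 20), 4)
```

With 50 intermediate vertices, as in step 2, each route ends up with 52 points, which is plenty for a smooth looking curve at this scale.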

Same as above, but with labels and pause at end (big)

For the next stage, the way I did it was to open the dbf part of the new flight paths shapefile in Excel and then calculate an 'offset' lat and long column which I could use to animate the dots. You just need to read the Animate Columns part of Michael Minn's MMQGIS page to understand this. Once you've calculated the new lat/long offset columns you can save the csv. Once you've done this, import the csv into QGIS using the Add Delimited Text Layer (comma icon), using the airport origin lat/long as the x,y coordinates in the import dialogue. Filter the new layer so it only shows ATL origins and you'll just see one dot for ATL but actually there are many dots in the same location as they all have the same origin lat/long. Save the filtered layer as a new shapefile and then run the Animate Columns tool in MMQGIS using the appropriate fields and the number of animation frames you want (50 works well with this example). An important point here is that you need to make sure your QGIS window is quite small as the extracted image frames will be the same size as your QGIS map frame and if it's too big it will make a massive GIF.
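If you'd rather not do the offset columns in Excel, the same calculation is easy to script. This is only a sketch of my understanding of what Animate Columns expects, namely the total distance each dot should travel in x and y over the whole animation, so check Michael Minn's page before relying on it. The column names and the sample row are made up.

```python
import csv
import io


def add_offsets(rows):
    """Add x_offset/y_offset columns: destination minus origin, in degrees.

    rows: dict-like records with orig_lon, orig_lat, dest_lon, dest_lat
    (hypothetical column names chosen for this sketch).
    """
    out = []
    for row in rows:
        new = dict(row)
        new["x_offset"] = float(row["dest_lon"]) - float(row["orig_lon"])
        new["y_offset"] = float(row["dest_lat"]) - float(row["orig_lat"])
        out.append(new)
    return out


# One invented route with round-number coordinates, standing in for the exported table
sample_csv = """origin,dest,orig_lon,orig_lat,dest_lon,dest_lat
ATL,XXX,-84.5,33.5,-0.5,51.5
"""
rows = add_offsets(csv.DictReader(io.StringIO(sample_csv)))
```

Each dot starts at the origin lat/long and, by the final frame, has moved by its offset, which puts it on top of its destination airport.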

You now have all you need to create an animation. There are many ways to create an animated GIF, but using GIMP is very simple. You can download this free, open source image manipulation programme in a few minutes. You then just need to go to File, Open as Layers and then select all the frames you just created in QGIS and GIMP will add them to the project and they'll appear in the Layers pane. You don't need to reorder them as they are numbered correctly from the MMQGIS export. From here you can go to File, Export and then select the GIF file format and use the animation options here. Try 50 milliseconds between frames as with 50 frames this will create a nice short 2.5 second animation that isn't too slow. You should use Filters, Animation, Optimize for GIF and then export from that window if you want a much smaller file size. I created another one of these visuals, centred on LHR and showing flows from JFK, LHR and PEK.
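The timing arithmetic is worth a quick sanity check before you export, particularly if you want the last frame to linger. Here's a tiny Python helper (my own, nothing to do with GIMP itself) that works out the total running time of one loop.

```python
def gif_duration_ms(n_frames, delay_ms, final_pause_ms=None):
    """Total length of one loop of the animation in milliseconds.

    If final_pause_ms is given, the last frame uses that duration
    instead of the standard delay.
    """
    if final_pause_ms is None:
        return n_frames * delay_ms
    return (n_frames - 1) * delay_ms + final_pause_ms


standard = gif_duration_ms(50, 50)                          # 50 frames at 50 ms each
with_pause = gif_duration_ms(50, 50, final_pause_ms=1000)   # last frame held for 1 second
```

The first call gives the 2.5 seconds mentioned above. Holding the final frame for a full second adds just under another second to each loop.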

Same techniques, different data (and also a bit crazy)

That's quite a lot of information and quite a few steps but if you try this and still can't make it work feel free to get in touch via twitter or e-mail. Why would you want to do this? I'll leave that up to anyone who wants to try it but displaying movements of people and goods is relevant across a number of disciplines so hopefully some will find this useful.

Other tips and information: depending upon which location you're choosing, some of your lines or dots might be going the wrong way round the earth but you can fix this with a bit of simple maths in the offset calculation. In GIMP, you can give a frame a different duration by adding a number of milliseconds in brackets to the layer name - e.g. (1000ms) - so that it creates a pause effect, as in the examples above. I created an ATL point layer and a destination airports point layer from the imported csv so that I could manually create a couple of extra frames to add in to the end of my animation: one to show destination airport names and the other just to label ATL. For the glow effect in the flow lines in the static image I used the Feature Blend 'addition' option in layer properties in QGIS.
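On the 'wrong way round the earth' problem, here's one possible version of the simple maths, sketched in Python. It is my own guess at a tidy fix and may not be exactly what I did in the spreadsheet. The idea is to wrap the longitude difference into the range -180 to 180 so a dot always takes the shorter way round.

```python
def wrap_lon_offset(orig_lon, dest_lon):
    """Longitude offset in degrees, wrapped so the dot takes the shorter way round.

    The result always lies in the range [-180, 180).
    """
    delta = dest_lon - orig_lon
    return (delta + 180.0) % 360.0 - 180.0


# Crossing the antimeridian: 20 degrees eastwards rather than 340 degrees westwards
east_hop = wrap_lon_offset(170.0, -170.0)
# An ordinary route is left unchanged
normal = wrap_lon_offset(-84.5, -0.5)
```

One caveat: a wrapped offset can push a dot's longitude beyond 180 degrees in either direction, so check how the animation renders near the antimeridian before trusting it.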

Acknowledgements: As ever, I've borrowed ideas and techniques from other QGIS users, including Hamish Campbell, Nyall Dawson and Nathan Woodrow. I decided to have a look at this after an e-mail exchange with Waldo Tobler about migration data. Thanks of course to the excellent OpenFlights team who make their data available under an Open Database Licence.