Showing posts with label 2015. Show all posts

Monday, 9 November 2015

Premier league poverty, 2015

Over the past decade I've spent a lot of time looking at patterns of deprivation across the UK. One thing I've often noticed is the way football grounds regularly appear in the very poorest neighbourhoods. I've blogged on this topic a few times in the past, most notably in 2012 when I looked at the location of English Premier League grounds in relation to the deprivation level of their areas. I also noticed this in relation to the Scottish Index of Multiple Deprivation when looking at the East End of Glasgow in 2009. Given the history of football, its industrial working class origins, the development of British cities, and land values (to name just a few factors), none of this should be a surprise. But since a new English deprivation dataset was released in September, I wanted to revisit the topic and make a few maps, just to see if anything has changed. That's what I've done here - one map for each Premier League ground, showing its location and the deprivation level of the area it sits in and of its wider neighbourhood. Further explanation follows below.

Note the blue area to the north east - now Highbury Square

Most areas in the neighbourhood are in the 20% most deprived

Bournemouth is the one big exception - it's at the opposite end of the scale

Stamford Bridge is in a much more mixed area than most

Selhurst Park sits right beside some more deprived areas

Goodison and Anfield look very similar - only about half a mile apart

The wider area of Leicester's ground is more mixed

This is quite typical of much of north Liverpool

Manchester City play in the most deprived area of any top flight team

Manchester United are situated in a more mixed land use area

Newcastle United's pitch is split between areas - I've based this on majority area

Norwich City's ground is also in a more mixed neighbourhood

St Mary's is situated in one of the city's most deprived zones

Stoke also play in quite a deprived area - though there's more variation nearby

Like many newer stadiums, this ground is in a slightly more mixed area

The Welsh deprivation dataset is used here - but similar story to be told

Post-riots, much has been said of regeneration in this area of London

Watford play in a much less deprived locality

West Brom's ground is in the most deprived decile of England

Another ground split between areas - but still more deprived than not

What does all this tell us?
The most obvious thing to emerge from this simple mapping exercise is that more than half of all Premier League grounds are located in areas among the 20% most deprived in the country, but a good few are not. Two in particular - Bournemouth and Watford - are in much less deprived areas. Nonetheless, if you scroll through the maps quickly, the main colour you'll see is red (for the 20% most deprived). When I see stories in the news about the ability of sport to tackle deprivation, I'm generally all for it. But sometimes I make a mental comparison between the wage bills of some teams and the neighbourhoods they're located in, and I think we've barely scratched the surface of what's possible when we talk of the potential for elite sport to help transform poorer areas. Post-Olympics, this has kind of been forgotten. Having said this, it is good to see that the Premier League and FA's Football Foundation provides money for grass roots development in the most deprived areas as defined by this very same dataset.


What does it not tell us?
Quite a lot, and I wouldn't want anyone to think that I've done this to pick on any one team. I'm just curious about the relationship between these football grounds and underlying patterns of deprivation because when I look at the data as I map it, I often notice the stadia. It doesn't tell us anything about cause and effect, whether teams are trying to do anything to boost the fortunes of their local areas or what the areas themselves are like to live in. If you want to know more about the underlying data, read this briefing from the Government. Does having a Premier League football team in your area make you poor? Of course not. 


Some of the grounds look the wrong shape - why?
I used building footprint data from the Ordnance Survey in the maps above and the shapes of the grounds are as they were in the original dataset - with the exception of Vicarage Road, which for some reason wasn't enclosed on one side so I made my own version. I've just added a little glow around each ground to make it stand out and then added in the footprints of all other buildings in the wider neighbourhood to help people identify nearby features and roads.


What about when a ground is split between areas?
I could have taken the average deprivation rank here and used that figure but instead I chose to use the deprivation rank of the area that the majority of the playing surface was located in. This was only really an issue for Arsenal, Newcastle and West Ham - and only really notable in Newcastle. 
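For anyone curious how the two options differ, here is a minimal sketch in Python. The ranks and pitch shares are made up purely for illustration, and the function names are hypothetical; it is not the workflow used to make the maps, just the arithmetic behind 'majority area' versus 'average rank'.

```python
def majority_area_rank(overlaps):
    """Return the rank of the area covering the largest share of the pitch.

    overlaps is a list of (deprivation_rank, percent_of_pitch) pairs.
    """
    return max(overlaps, key=lambda pair: pair[1])[0]


def weighted_average_rank(overlaps):
    """Return the average rank, weighted by each area's share of the pitch."""
    total_share = sum(share for _, share in overlaps)
    return sum(rank * share for rank, share in overlaps) / total_share


# Hypothetical ground: 60% of the pitch in an area ranked 2,000
# and 40% in an area ranked 9,000 (invented numbers).
example = [(2000, 60), (9000, 40)]

print(majority_area_rank(example))     # 2000
print(weighted_average_rank(example))  # 4800.0
```

The majority rule always reports a rank that belongs to a real area, whereas the average can land on a value that describes neither side of the pitch.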


Explain that 'deprivation percentile' thing again please
In England, there are 32,844 areas known as Lower Super Output Areas. These LSOAs are small areas which the government use to report all kinds of statistics, including Census data. When they publish their Indices of Deprivation, they give each one of the 32,844 areas a rank, from 1 (most deprived in England) to 32,844 (least deprived in England). Therefore, it's a relative measure that allows us to compare one area with another, all across the country. The data are often split into five or ten chunks (quintiles or deciles) for reporting purposes but here I've decided to use 'percentiles' as it's more precise. If an area is in percentile 5, it's among the 5% most deprived in England, and so on. If it's in percentile 95 (like Bournemouth's ground) then we can say it's not very deprived at all and actually highly likely to be a very affluent area. In the case of Swansea City, I've used Welsh deprivation data from 2014. This classifies places in almost exactly the same way, although there are 1,909 areas in Wales rather than 32,844. These areas have an average population of around 1,600.
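To make the rank-to-percentile step concrete, here is a minimal sketch in Python. It assumes percentiles are simply equal-sized bins over the ranks, rounding up, which may differ slightly at the boundaries from how the official figures or the maps above were produced. The function names are made up for illustration.

```python
ENGLAND_LSOAS = 32_844  # areas ranked in the English indices
WALES_LSOAS = 1_909     # areas ranked in the Welsh data


def rank_to_bin(rank, total_areas, bins):
    """Place a rank (1 = most deprived) into one of `bins` equal-sized groups."""
    if not 1 <= rank <= total_areas:
        raise ValueError("rank must be between 1 and total_areas")
    # Integer ceiling division, so rank 1 falls in bin 1 and the
    # last rank falls in the final bin.
    return (rank * bins + total_areas - 1) // total_areas


def percentile(rank, total_areas=ENGLAND_LSOAS):
    return rank_to_bin(rank, total_areas, 100)


def decile(rank, total_areas=ENGLAND_LSOAS):
    return rank_to_bin(rank, total_areas, 10)


print(percentile(24))      # 1   (rank 24 is within the most deprived 1%)
print(percentile(32_844))  # 100 (the least deprived area in England)
print(decile(24))          # 1
```

The same functions work for the Welsh data by passing `WALES_LSOAS` as the total.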


Isn't this all just pointless area classification?
You might think so, but the Government use these Indices to make all sorts of important decisions, in healthcare and education for example. If you're in an area classified as being among the 20% most deprived, for example, you might find that you're eligible for some kind of funding - there are loads of examples of uses, with sport being one of many. You can find quite a few other examples in section 1.4 (p. 8) of this report. We must also remember that not all people living in areas classified as 'deprived' or 'not deprived' match that description - this dataset classifies areas not people.


When are you going to expand this to include my team?
I'm not planning to, but I'm sure it would be even more interesting than the Premier League.


Curiosities
On all the maps, north is up so I couldn't help but notice that Manchester United seem to be the only team playing on an east to west pitch. I'm guessing most grounds don't do this so that they can avoid the setting sun problem - and in fact Old Trafford cricket ground rotated their pitch 90 degrees to avoid this problem in 2010. Shades of blue - representing the 40% least deprived areas - appear on only 7 of the maps, and only two grounds are in such areas. Red (20% most deprived areas) appears on 19 maps - Watford is the only exception. The maps for Everton, Man City, Tottenham and West Brom are entirely red - which indicates that these grounds and surrounding areas (a few hundred metres in each direction) are within wider areas classified as the most deprived in England. The very most deprived areas to appear on any of the maps are ranked 24 (beside Goodison) and 29 (beside Anfield).


Which team do you support?
ICTFC, of course. But not very enthusiastically. 




Thursday, 1 October 2015

Are map legends too lazy?

A somewhat click-baity blog title, but I wanted to crowdsource some knowledge from proper carto/viz people, so if you have any insights on what I write, please feel free to get in touch via twitter or e-mail. No doubt what I write about below already has a name but I don't know what that is and I haven't seen this functionality in proprietary or open source GIS. By asking 'are map legends too lazy', what I really mean is are GIS-made choropleth map legends doing enough for us in their current form - and is there an opportunity for us to add some new functionality which enhances the communicative power of the humble choropleth legend? An example... look at the map below, which I created in QGIS. It's a map of a new deprivation* dataset for England, focused on the local authority of Birmingham.

Deprivation choropleth, with legend and inset map

This dataset is typically understood and discussed in terms of deciles, hence the classification used above. The dataset goes from decile 1 (most deprived) to decile 10 (least deprived) - within the context of England as a whole. Cities like Birmingham tend to have a higher proportion of their small areas in the most deprived decile, and in map form this results in lots of red and not much blue, as you can see above. If you wanted to find out how many areas were in decile 1 (most deprived) you would know that it was 'a lot' but because the inner-urban areas tend to be smaller in size (relative to the blue ones), making an accurate assessment visually is quite difficult. In fact, owing to the different sizes of the spatial units, you could quite easily take the wrong message away from a choropleth like this.

My solution? Make the legend do more work. Make it tell us not just what the colours represent but also what proportion of areas are in each category by scaling the colour patches relative to the proportion of areas in each choropleth class - in the form of a bar chart - what I call a 'bargend' (jump in at this point if you already have a name for this). You could, without much effort, add in a table or a separate chart, but I want the legend to actually be the bar chart. In part, I was inspired to attempt this in QGIS because of Andy Tice's prototype scatterplot layout and his comment that he'd like to get it working in the QGIS Atlas tool. Here are some results, followed by further thoughts.
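The arithmetic behind a legend like this is simple, and a rough sketch in Python may help. This is not the QGIS Atlas setup used for the maps, just an illustration of the idea: count the areas in each decile, convert to percentages, and scale each legend patch accordingly. The function names and the sample data are invented.

```python
from collections import Counter


def decile_shares(deciles):
    """Return {decile: percentage of areas} for deciles 1 to 10."""
    if not deciles:
        raise ValueError("need at least one area")
    counts = Counter(deciles)
    return {d: 100 * counts.get(d, 0) / len(deciles) for d in range(1, 11)}


def bargend_rows(shares, max_width=40):
    """Build text legend rows, with the largest class filling max_width."""
    largest = max(shares.values())
    rows = []
    for d, pct in shares.items():
        width = round(max_width * pct / largest) if largest else 0
        rows.append(f"{d:>2} {'#' * width} {pct:.1f}%")
    return rows


# Invented example: ten areas, one decile value per area.
sample = [1, 1, 1, 1, 2, 2, 2, 5, 5, 10]
for row in bargend_rows(decile_shares(sample)):
    print(row)
```

In a real layout the `#` characters would be replaced by colour patches of the same relative length, one per choropleth class.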

This time, I've added in a 'bargend'

A closer look at the bargend for Birmingham

When I do a visual comparison of the Birmingham map, I'm surprised that the least deprived (i.e. richest) areas only account for 1.7% of the total, because I'm drawn to the blue of the choropleth. This could be solved through a cartogram approach, but I wanted to preserve geographical accuracy here. I'm not surprised that almost 40% of areas are in the poorest decile - that's what I'd expect from what I know about deprivation in English inner-cities. Let's look at another example below.

The London Borough of Tower Hamlets

This time I've shown one of the poorest parts of London - Tower Hamlets. An interesting aside here is the emergence of one area in decile 9 (i.e. richer area) compared to the pattern from 2010. This is almost certainly linked to gentrification and displacement rather than individuals becoming 'less deprived'. I find the extra information provided in the bargend very useful analytically/cognitively compared to the simple legend we would normally use.

Now let's look at a few more...

Liverpool contains relatively few 'non-deprived' areas

Like Liverpool, Manchester has many poor areas

Middlesbrough has the highest % in the most deprived decile

One of the benefits of this approach, in my view, is when you compare different places - you can click on an image above and then go forward and backward to make comparisons. The added value of the bargend approach means that you have precise details of the proportion of areas in each decile and you can make more meaningful comparisons. You could just do this with a table or chart and dispense with the map altogether, but then you'd lose the very important ability to identify where precisely individual areas are and where spatial concentrations of deprivation (and affluence) exist. Talking of affluence, it's only fair that I show you some maps of places that are at the opposite end of the scale. Two prime examples...

A beautiful part of the world, but very blue

Hart, you almost broke my chart (highest % in decile 10)

I'll wrap up with a few points.

1. I'd love it if someone could find a way to add in this functionality natively in QGIS. I had to do a bit of thinking and tinkering to automate this in the Atlas tool, but I now have it working well and everything dynamically updates and re-positions itself once you set it up.

2. I wouldn't always want to use a bargend, but I think it's something that adds value without taking up much more space (if any) in map layouts.

3. I'm trying to think of any drawbacks of this approach, but I can't. I'm happy for others to chip in with ideas on this.

4. I think 'bargend' is a terrible word. Please tell me it already has a nice sounding name, or invent one for me. [update: in my rush to coin a phrase, and because I was mapping deciles as categories - as in a bar chart - I was thinking about bar charts rather than histograms. This is really a histogram but it uses named categories (deciles) which in theory could be re-ordered and the chart would still make sense, so perhaps the bargend retains qualities of both and, anyway, a histogram still uses bars]

5. Are map legends too lazy? Not really, but they can sometimes work harder.


Addendum
Andrew Wheeler very kindly got in touch to share a few relevant papers on the subject. The Kumar paper is very close to what I propose (though he does the chart for the entire dataset rather than a subset) and he calls it a 'Frequency Histogram Legend' - more accurate perhaps, but less catchy. The Dykes et al. paper is very interesting and I like the treemap approach.

Hannes (@cartocalypse) also got in touch to say he likes the idea and he's suggested 'legumns', which is also useful (but more difficult to pronounce!).

I'll add more on the topic if people respond.

* Just in case the use of this word sounds odd to you, we use the word 'deprivation' in the British context in studies of urban poverty/disadvantage but it's not exactly the same thing. I've written about this in previous academic papers but to all intents and purposes more deprived means 'poorer' and less deprived means 'richer'. In the maps above, you could say red: poor and blue: rich and you wouldn't be wrong (ecological fallacies notwithstanding).

Tuesday, 12 May 2015

The 2015 General Election: London Results

The perennially excellent London DataStore has published comprehensive, accessible and usable data on the 2015 General Election. So, naturally, I had to make some maps of it. There probably isn't anything about this election that hasn't been mapped but since my last blog was a General Election piece I thought I'd do a little follow-up, with not a hexagon in sight. There have been quite a few 'who came second' maps but not many which include third and fourth places. I'm particularly interested in London because it's something of an exception in the South East of England and, well, I just wanted to make some maps. Below you'll see maps for who came first, second, third and fourth. The last map has the constituency names. I resisted the temptation to do a 'who came eleventh' map, but, since you asked, there were only four constituencies where there were at least 11 candidates, and the parties included the Whig Party. They might have been quite prominent had these maps been made in 1830 (incidentally, there were 658 constituencies in the UK then, which only had 24 million people).

'That's Blockbusters' - for Labour


UKIP emerge and Tories dominate second place


UKIP by far the most in third place - Greens emerge


No place for three 'main' parties here


Just in case you don't know all constituencies off by heart


You should be able to see pretty big versions of the maps if you click on them and then open them in a new tab/window. I've dispensed with the usual boring map legend and instead turned it into a 'bargend' (a portmanteau I just invented). I hope you find these interesting. 

Final nugget: the Whig Party came 9th in Bethnal Green and Bow (their best result) with 203 votes for my namesake, Alasdair Henderson.




Thursday, 7 May 2015

Can Google search data predict an election victory?

Today seems like a good day to write a little blog post on what search data can and cannot tell us. Why? Because of the story below, which has been on the front page of the MailOnline for much of the day. This is just the way news works, but I thought it would be useful to give a bit more information here. The story behind the story goes something like this...

Simon Rogers, datajournalist and Data Editor at Google in San Francisco, got in touch some time in April to ask if I could help him map party leader search patterns by constituency. I'd been doing a lot of work with search data for housing markets anyway so this seemed like an interesting idea. We took search data for all points that were geocoded (there were about 5,000 total across the UK) and then produced a constituency version for all 650 seats. The final constituency results matched very closely the proportions for the individual places. The data are for the previous 12 months. 

MailOnline front page 07/05/15

The big question is what this all means. Do I think, as the MailOnline suggest, that 'Google Search tips Cameron to win election'? No. Do I think it disproves it? No. Do I think the large volume of search for Nigel Farage indicates his level of popularity across the country? Again, no. However, it could indicate that people are more likely to show interest in UKIP in an environment where nobody else is watching or listening. But we don't know. Does this prove that Miliband will come third? Definitely not. The map merely indicates who was the most searched for party leader in each constituency. The intent and sentiments of individual users are not known. In my own research in housing market analysis I've tackled this by doing interviews with website users but since this data is from Google they could of course add in other terms which people might combine with party leader names (some more favourable than others!).
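To show how little the map actually claims, here is a minimal sketch in Python of the 'most searched leader per constituency' step. It assumes each geocoded place has already been assigned to a constituency and that volumes are simply summed, which may not match how the real figures were weighted. The records, names and function are all hypothetical.

```python
from collections import defaultdict


def most_searched_by_constituency(records):
    """Return {constituency: leader with the highest summed search volume}.

    records is an iterable of (constituency, leader, volume) tuples,
    one per geocoded place and leader.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for constituency, leader, volume in records:
        totals[constituency][leader] += volume
    return {
        constituency: max(leaders, key=leaders.get)
        for constituency, leaders in totals.items()
    }


# Invented records: two places in 'Seat A', one in 'Seat B'.
records = [
    ("Seat A", "Leader X", 30.0),
    ("Seat A", "Leader Y", 45.0),
    ("Seat A", "Leader X", 25.0),
    ("Seat B", "Leader Y", 10.0),
    ("Seat B", "Leader Z", 12.0),
]

print(most_searched_by_constituency(records))
# {'Seat A': 'Leader X', 'Seat B': 'Leader Z'}
```

Note that the result keeps only the top name in each seat. It says nothing about why people searched or how close the runner-up was.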

Kate Newton from Bing also got in touch to say they had worked on something similar (though much more sophisticated) in relation to the Scottish Independence referendum last year. More widely, there is a body of emerging research (including my own) which looks at search patterns and subsequent activity - mostly in the field of economics. The results suggest that search can be analysed meaningfully to predict future activity. But that's not what the party leader piece was about - not from my perspective. Thankfully, other media outlets were more measured in their analysis - such as BuzzFeed UK, The Scotsman and The Telegraph.

MailOnline story 07/05/15


What's most interesting to me? Well, I'm most interested to see how the search patterns relate to outcomes in key marginals. I suspect there will not be much of a pattern but if there is it will be interesting to attempt to take this little piece of work further - perhaps for the US 2016 election. Other than that, this is an interesting stocking filler on a day when the papers and TV crews are forbidden from reporting anything really substantial until the polls close at 10pm. In the meantime, my favourite snippets from the search map...

David Cameron is the most searched for leader in his own constituency, but he's surrounded by a sea of purple, plus a blob of red and orange.

Witney - David Cameron's constituency

Perhaps a little predictably, there is a lot of search for UKIP in Kent but - strangely - it appears that in the constituency where Nigel Farage is standing (Thanet South) the party leader most searched for is Ed Miliband.

UKIP - lots of search in Kent, but not so much in Thanet South

The search data throw up some interesting results. The image below is a good example. Natalie Bennett (Green Party) is the most searched for leader in Durham North West and next door in Durham North the leader (by search over the past 12 months) is Leanne Wood (Plaid Cymru). I suspect this was down to a localised spike in interest after the leaders' debates.

Durham - Green and Welsh Nationalist stronghold?

Another interesting nugget to emerge was the way in which geographical patterns sometimes reflected the opposite of what you'd expect. The most obvious example was where Nicola Sturgeon (SNP and not standing in this election) was the most searched for party leader in several English constituencies, such as Chesterfield (below). Her excellent performance in the leaders' debates probably led to a spike in interest. Perhaps the SNP ought to consider putting up candidates in England too.

The SNP take Chesterfield? Not so fast.

A kind of similar situation to the SNP/Chesterfield example can be seen in the final image below, where Nigel Farage is the most searched for leader in Aberdeen North. This Scottish constituency has no UKIP candidate and, even if it did, they would be a long way away from the top party.




What's my prediction for the outcome of the election? The only prediction I'll make is that the results will look nothing like this map!