Thursday, 1 October 2015

Are map legends too lazy?

A somewhat click-baity blog title, but I wanted to crowdsource some knowledge from proper carto/viz people, so if you have any insights on what I write, please feel free to get in touch via Twitter or e-mail. No doubt what I write about below already has a name but I don't know what that is and I haven't seen this functionality in proprietary or open source GIS. By asking 'are map legends too lazy', what I really mean is are GIS-made choropleth map legends doing enough for us in their current form - and is there an opportunity for us to add some new functionality which enhances the communicative power of the humble choropleth legend? An example... look at the map below, which I created in QGIS. It's a map of a new deprivation* dataset for England, focused on the local authority of Birmingham.

Deprivation choropleth, with legend and inset map

This dataset is typically understood and discussed in terms of deciles, hence the classification used above. The dataset goes from decile 1 (most deprived) to decile 10 (least deprived) - within the context of England as a whole. Cities like Birmingham tend to have a higher proportion of their small areas in the most deprived decile, and in map form this results in lots of red and not much blue, as you can see above. If you wanted to find out how many areas were in decile 1 (most deprived) you would know that it was 'a lot' but because the inner-urban areas tend to be smaller in size (relative to the blue ones), making an accurate assessment visually is quite difficult. In fact, owing to the different sizes of the spatial units, you could quite easily take the wrong message away from a choropleth like this.

My solution? Make the legend do more work. Make it tell us not just what the colours represent but also what proportion of areas are in each category by scaling the colour patches relative to the proportion of areas in each choropleth class - in the form of a bar chart - what I call a 'bargend' (jump in at this point if you already have a name for this). You could, without much effort, add in a table or a separate chart, but I want the legend to actually be the bar chart. In part, I was inspired to attempt this in QGIS because of Andy Tice's prototype scatterplot layout and his comment that he'd like to get it working in the QGIS Atlas tool. Here are some results, followed by further thoughts.
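To make the idea concrete, here is a minimal Python sketch of the arithmetic behind a bargend: the share of areas in each class, and a patch width scaled to that share. It is not the QGIS Atlas setup used for the maps in this post; the function names (`bargend_shares`, `bar_widths`) and the sample values are made up for illustration.

```python
from collections import Counter


def bargend_shares(deciles, classes=range(1, 11)):
    """Return the percentage of areas that fall in each legend class."""
    if not deciles:
        raise ValueError("need at least one area")
    counts = Counter(deciles)
    total = len(deciles)
    return {c: 100.0 * counts.get(c, 0) / total for c in classes}


def bar_widths(shares, max_width=40.0):
    """Scale each legend patch so the largest class is max_width units wide."""
    largest = max(shares.values())
    if largest == 0:
        return {c: 0.0 for c in shares}
    return {c: max_width * s / largest for c, s in shares.items()}


# Hypothetical decile values for ten small areas (not real data)
sample = [1, 1, 1, 1, 2, 2, 3, 5, 8, 10]
shares = bargend_shares(sample)
widths = bar_widths(shares)

# Crude text version of the legend: one row per decile
for decile in sorted(shares):
    bar = "#" * round(widths[decile] / 4)
    print(f"{decile:>2} {bar:<10} {shares[decile]:.1f}%")
```

Scaling against the largest class, rather than against 100%, keeps the longest bar at a fixed width, so the legend takes up the same space whichever area is being shown.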

This time, I've added in a 'bargend'

A closer look at the bargend for Birmingham

When I do a visual comparison of the Birmingham map, I'm surprised that the least deprived (i.e. richest) areas only account for 1.7% of the total, because I'm drawn to the blue of the choropleth. This could be solved through a cartogram approach, but I wanted to preserve geographical accuracy here. I'm not surprised that almost 40% of areas are in the poorest decile - that's what I'd expect from what I know about deprivation in English inner-cities. Let's look at another example below.

The London Borough of Tower Hamlets

This time I've shown one of the poorest parts of London - Tower Hamlets. An interesting aside here is the emergence of one area in decile 9 (i.e. richer area) compared to the pattern from 2010. This is almost certainly linked to gentrification and displacement rather than individuals becoming 'less deprived'. I find the extra information provided in the bargend very useful analytically/cognitively compared to the simple legend we would normally use.

Now let's look at a few more...

Liverpool contains relatively few 'non-deprived' areas

Like Liverpool, Manchester has many poor areas

Middlesbrough has the highest % in the most deprived decile

One of the benefits of this approach, in my view, is when you compare different places - you can click on an image above and then go forward and backward to make comparisons. The added value of the bargend approach means that you have precise details of the proportion of areas in each decile and you can make more meaningful comparisons. You could just do this with a table or chart and dispense with the map altogether, but then you'd lose the very important ability to identify where precisely individual areas are and where spatial concentrations of deprivation (and affluence) exist. Talking of affluence, it's only fair that I show you some maps of places that are at the opposite end of the scale. Two prime examples...

A beautiful part of the world, but very blue

Hart, you almost broke my chart (highest % in decile 10)

I'll wrap up with a few points.

1. I'd love it if someone could find a way to add in this functionality natively in QGIS. I had to do a bit of thinking and tinkering to automate this in the Atlas tool, but I now have it working well and everything dynamically updates and re-positions itself once you set it up.

2. I wouldn't always want to use a bargend, but I think it's something that adds value without taking up much more space (if any) in map layouts.

3. I'm trying to think of any drawbacks of this approach, but I can't. I'm happy for others to chip in with ideas on this.

4. I think 'bargend' is a terrible word. Please tell me it already has a nice sounding name, or invent one for me. [update: in my rush to coin a phrase, and because I was mapping deciles as categories - as in a bar chart - I was thinking about bar charts rather than histograms. This is really a histogram but it uses named categories (deciles) which in theory could be re-ordered and the chart would still make sense, so perhaps the bargend retains qualities of both and, anyway, a histogram still uses bars]

5. Are map legends too lazy? Not really, but they can sometimes work harder.

Andrew Wheeler very kindly got in touch to share a few relevant papers on the subject. The Kumar paper is very close to what I propose (though he does the chart for the entire dataset rather than a subset) and he calls it a 'Frequency Histogram Legend' - more accurate perhaps, but less catchy. The Dykes et al. paper is very interesting and I like the treemap approach.

Hannes (@cartocalypse) also got in touch to say he likes the idea and he's suggested 'legumns', which is also useful (but more difficult to pronounce!).

I'll add more on the topic if people respond.

* Just in case the use of this word sounds odd to you, we use the word 'deprivation' in the British context in studies of urban poverty/disadvantage but it's not exactly the same thing. I've written about this in previous academic papers but to all intents and purposes more deprived means 'poorer' and less deprived means 'richer'. In the maps above, you could say red: poor and blue: rich and you wouldn't be wrong (ecological fallacies notwithstanding).

Monday, 14 September 2015

The Shapes of Cities

For a long time, I've been interested in the shape of cities and I suspect that if you're reading this you might be similarly afflicted. By 'shape' I mean their political boundaries as opposed to their general urban footprint. The latter can be seen by driving around or from a plane window, particularly when it's dark, but the political boundaries are much less obvious. This is particularly true of US cities. Take Houston, Texas - the first example below. 

The boundary of the City of Houston - Google Map

Look closely at this and - at least if you're not used to the political geography of American cities - you might be very confused by this fragmented, segmented mess of boundaries. Then go to Google maps and try different search terms, such as 'city of los angeles' or 'city of columbus' (Ohio) and you'll soon discover that Houston isn't that unusual at all. Try it for other cities and you'll see what I mean. Columbus, Ohio is a particular favourite of mine as I know it quite well having lived there for a couple of years in the early 2000s.

City of Los Angeles - Google Map

City of Columbus (Ohio) - Google Map

These unnatural-looking boundaries are the result of a complex mix of geography, history and politics that have real impacts on the ground. From education and transport to housing and waste management, the shapes of cities really do matter in this respect. Of course, this is a much-studied topic in urban studies, not least by Professor John Parr of the University of Glasgow. In Parr's economic definitions of the city, he outlines four types - but none of these explain the kinds of boundaries we see above. One major explanatory factor in all of this, of course, is tax revenue. But I'm not going to get into that now because it opens up a whole range of other topics, including white flight, suburbanisation, and schooling, amongst other things. The point is that the 'shapes' of cities are not accidental and who is included or excluded is inherently political. 

In the United Kingdom, we might not have such unusual city boundaries, but the political geography of our cities is far from perfect - perhaps one reason for the resurgence of the 'city-region' concept over the past decade or so. When we're talking about urban economies, it makes much more sense to think about the functional urban area than it does to use data associated with an arbitrary political shape. This is as true in the US as it is in the UK. The example below shows that the City of Atlanta has fewer people than the City of Liverpool and that it's only slightly bigger in scale. But anyone who knows anything about these places will understand that 'Atlanta' is much bigger than 'Liverpool' and is vastly more sprawling, with a metropolitan population of around 5.5 million compared to less than a million in 'Liverpool' (by one definition).

Atlanta vs Liverpool - which is bigger?

These kinds of issues are part of the reason organisations like the Centre for Cities use the Primary Urban Area definition of cities for the 64 largest urban areas of the UK. In a recent study, I used a definition developed by Geolytix which is based on the 'sprawl' of the urban area rather than political boundary and found this to be a much better fit than the administrative area. When conducting comparative analyses of cities, we need to ensure we are comparing like with like, and using a functional definition often helps avoid the kinds of underbounded/overbounded problems that arise when (e.g.) comparing places like Manchester and Leeds. The former is normally said to be 'underbounded' because the functional urban area is much bigger than the local authority area of the same name and the latter is said to be 'overbounded' because of its much wider local authority area, which extends beyond the core urban fabric. For a comparison of UK 'city' sizes, see this graphic I produced a few years ago:

All cities shown at the same scale

Surely there's a point to all of this?

Yes, glad you asked...

For planners, politicians, residents and neighbours, the shapes of cities matter enormously. It might dictate which school your children can go to, whether your local facilities are well funded, whether you have a well-functioning local transport system, when your bins get emptied, how many pot-holes you have in your street and all sorts of other things. But let's not get into that now. Instead, I'll end with another city shape, this time for the City of Detroit (one of my favourite cities, but much-maligned).

Detroit - 8 mile boundary line to the north

Thursday, 10 September 2015

From mega-regions to mega commutes: US commuting working paper

My previous post provided some images from a recent piece of work I did on mapping tract-to-tract commuting patterns in the contiguous United States. This post provides a bit more background and extracts from a working paper, plus some of the original map outputs from the project - which are different in style (kind of a night time view). The focus is also more on mega commutes and mega-regions (think Gottmann's 'megalopolis'). I also provide a bit more detail on the method and data.

A constellation of cities in the Midwest

Being a member of the Regional Studies Association for a good few years now, I've followed various debates about regions, city-regions and mega-regions - including the very interesting work on mega-regions by the America2050 project of the Regional Plan Association. I also have a longstanding interest in commuting flows (and mapping them) so I set myself the challenge of mapping micro-level commuting flows in the contiguous United States in the hope of identifying what I expected would be some interesting mega-region commutes. I also hoped, in the context of this data, that I would discover some of the mega commutes identified by Rapino and Fields of the US Census Bureau. On both counts I wasn't disappointed. The first map below shows the entirety of the lower 48 states and the commuting patterns come out quite clearly.

Journeys to work in the contiguous United States

Obviously, some areas are more interesting than others, so I zoomed in on various areas, including California and the Northeastern United States. The map below shows travel to work patterns in California, and you can clearly see the wider Los Angeles metro area as one large commuter region, the Bay Area as another (but more polycentric), and also the settlement and journey to work patterns in the Central Valley, from Redding in the north down to Bakersfield in the South. This shows the urban settlement patterns in the state of California, but also the spatial configuration of the commuting connections between places.

If you take a closer look at the working paper behind these maps you'll find out more about the data. What I found most interesting were the locations where 'mega commuting' was prevalent, so I looked at the 20 Census tracts in the Northeastern US with the highest number of people commuting to them from over 50 miles away (each way). As you can see from the table below, this is dominated by New York City, but Washington DC also features. The total volume may not seem much, but remember that these are quite small Census tracts, with only a few thousand people.

Mega commuting in the Northeastern United States

I then did something slightly different - I wanted to filter the data in a more scientific manner. Since the data provided by US Census Bureau includes a margin of error (MOE) value for each individual tract-to-tract flow, I calculated the coefficient of variation for each individual flow line (there were just over 4 million). These were based on a 90% confidence level, so the formula was simply:

((MOE/1.645)/Commuting Estimate) x 100

I used a rather generous cut-off and then displayed only those flows which had a coefficient of variation of less than 40. The results are shown in the map below. We can see the expected pattern of commuting but - hold on a minute - what are those really long distance lines? Surely people don't 'commute' vast distances like this. Well, it turns out that this might actually be true because many of these lines begin and end in military locations or other places associated with regular, long distance moves for work and since the American Community Survey asks respondents how they usually got to work ‘last week’, it's entirely plausible that a number of people will work away from home and that this will lead to the kinds of patterns we see below. Or, to put it another way, don't think of these long lines as journeys people travel every day! 
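For anyone who wants to reproduce the filter, the calculation above can be sketched in a few lines of Python. The flow records and field names here are hypothetical, not the actual ACS column names; only the formula and the cut-off of 40 come from the text.

```python
def coefficient_of_variation(estimate, moe, z=1.645):
    """CV (%) from an estimate and its margin of error at 90% confidence.

    Dividing the MOE by 1.645 gives the standard error; the CV is the
    standard error as a percentage of the estimate.
    """
    if estimate <= 0:
        # No meaningful CV for a zero flow; treat it as unreliable
        return float("inf")
    standard_error = moe / z
    return standard_error / estimate * 100


# Hypothetical tract-to-tract flows (made-up values)
flows = [
    {"origin": "A", "dest": "B", "estimate": 120, "moe": 45},
    {"origin": "A", "dest": "C", "estimate": 15, "moe": 20},
]

CV_CUTOFF = 40  # the 'rather generous' cut-off described above

reliable = [
    f for f in flows
    if coefficient_of_variation(f["estimate"], f["moe"]) < CV_CUTOFF
]
print([(f["origin"], f["dest"]) for f in reliable])
```

The smaller flow drops out because its margin of error is larger than the estimate itself, which is typical of very small tract-to-tract flows.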

If you want to read more about it, you can click below to see the working paper, which also includes links to high resolution versions of the images shown here.

American Commute: working paper

Friday, 28 August 2015

Mapping the American Commute

Update, 20 September 2015: scroll to the bottom of the post if you want to download the data.

One of my summer projects this year has been attempting to map the American commute, following earlier work on a similar subject. Put simply, I've attempted to put together a map which shows commuting connections between locations in the contiguous United States, using the most fine-grained data I could find. Some of the results of this went into a recent piece in WIRED, and also CityMetric, and the larger piece of work it's based on is part of on-going research into the best ways of mapping commuting flows. The main images are below, followed by some more technical information. For now, all you need to know is that these images show commuting connections of 100 miles or less between Census tracts in the lower 48 states. You'll have to forgive me if your city isn't labelled! 

Higher resolution image available here

And now some zoomed in versions...

Zoom in of the west coast

Texas, and beyond!

Interesting patterns of connectivity in the Midwest

Look closely for some interesting inter-connections

The famous BosWash megalopolis

But this just shows where people live, doesn't it? Yes it does. But it also shows how the places where people live connect with other places from a functional economic point of view, at a fairly fine-grained level. It offers a slightly different view than just looking at the urban fabric alone which, I might add, is interesting in itself. Mapping flows like this is not exactly new, as this paper from Arthur Robinson (1955) on Henry Drury Harness (1837) demonstrates. Nonetheless, I haven't seen anyone map travel to work at this resolution for the United States, so I thought I'd have a go myself. 

If you spend some time looking at the big version of the map you can begin to see how places connect and where there are obvious disconnections, even between places that are not that far apart. One thing that you can pick up from the complete dataset (but not this batch of maps) is the growth of mega-commuting, as explained by Melanie Rapino and Alison Fields of the United States Census Bureau. 

Background information: the data I used is the most recent tract-to-tract journey to work dataset from the American Community Survey. This dataset covers journeys to work between the c. 74,000 census tracts in the United States and the complete dataset has around 4 million interactions. I mapped this in QGIS, using methods I've described previously on this blog. The tricky bits were dealing with the messy FIPS codes, dealing with the size of the dataset, and trying to decide what to label. There is quite a bit of error in the dataset (as acknowledged by the ACS people) and each individual flow line has a margin of error value associated with it, from which I also calculated the coefficient of variation. This is explained in a more detailed working paper, which I expect to publish in the coming months.
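The post doesn't say exactly what was messy about the FIPS codes, but one common problem with this kind of data is that leading zeros disappear when codes are read in as numbers. As an illustration only, here is a Python sketch that rebuilds an 11-character tract identifier from its parts (2-digit state, 3-digit county, 6-digit tract); the function name and the example tract number are made up.

```python
def tract_geoid(state, county, tract):
    """Build an 11-character tract identifier, restoring leading zeros.

    Accepts ints or digit strings, e.g. a state code that has been read
    as the number 6 instead of the text '06'.
    """
    return f"{int(state):02d}{int(county):03d}{int(tract):06d}"


# State 06 (California), county 075 (San Francisco), hypothetical tract
print(tract_geoid(6, 75, 10100))
print(tract_geoid("06", "075", "010100"))
```

Keeping the identifiers as fixed-width text makes it much easier to join the flow table to tract boundaries or centroids later on.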

Update, 20 September 2015: there has been quite a bit of interest in the underlying dataset I put together to create the maps, so I have decided to make the whole shapefile available here in the hope that others will find it useful and be able to produce some interesting analysis or visuals from it. I'm hoping someone will do a cool interactive web map of it, but it might be quite technically challenging. If you do use it, make sure you read the associated working paper, which explains the process and the underlying data. One word of warning: the uncompressed file is pretty big so you'll need a good computer.

Mapping the American Commute: download the data (213MB, zipped shapefile)

Tuesday, 4 August 2015

"The Regional World", version 2

I recently came back to CartoDB to do a bit of experimenting for some GIS work I'm doing this autumn, so I decided to revisit a topic I looked at before: sub-national regions of the world. In a previous version I posted via Twitter I took sub-national boundaries of the world and put together an interactive map (in about 15 minutes, so it wasn't very good). I've now produced a better one. It's not perfect but I have managed to add in an equal area projection version and other simple features - such as scale-dependent labelling and line styling.

The Regional World - version 2

According to Wikipedia, the largest sub-national divisions in the world are the Sakha Republic (Yakutiya) in Russia, Western Australia, and Krasnoyarsk Krai, also in Russia. The first two are more than ten times the size of the UK (which is 244,000 sq km) and number three almost is. If you click on the link above to go to the map then you'll see that you can also click on the equal area version. I did this because web maps often default to the Mercator projection, which causes massive distortion towards the poles and leads people into thinking Greenland is bigger than Africa, which of course it isn't.
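To give a rough sense of how strong that distortion is: on a (spherical) Mercator map, areas are inflated by a factor of 1/cos² of the latitude. A small Python sketch, with a made-up function name:

```python
from math import cos, radians


def mercator_area_inflation(latitude_deg):
    """Factor by which Mercator exaggerates area at a given latitude."""
    return 1 / cos(radians(latitude_deg)) ** 2


# No inflation at the equator, about 4x at 60 degrees, and more beyond
for lat in (0, 30, 60, 72):
    print(lat, round(mercator_area_inflation(lat), 1))
```

This is why high-latitude regions such as Greenland or the Sakha Republic look so much bigger on a default web map than they do in the equal area version.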

The Regional World - equal area projection

The equal area projection does of course mean that the shapes of areas towards the poles are extremely distorted, but that's part of the deal with some map projections. I've taken the administrative boundaries at face value, but of course they may not be 100% accurate, as the authors of the data acknowledge:

"This is the toughest dataset to keep current. Unlike the United States, other countries constantly rearrange their admin-1 units, slicing and combining them on a regular basis."

Read more about the data

You'll notice that I have put links to a small number of countries on the main map. I chose these because I find them interesting, that's all. This was part experiment with CartoDB and a little SQL (projection) and CSS (scale-dependent styling), part GIS project, part teaching material, partly driven by my interest in regions more generally, and part pre-holiday wind-down. In relation to the latter, just for fun, I have hidden two little artefacts in the main map that only appear when you zoom to a certain level at two places on earth. 

Can you find them? 

Answers via Twitter or e-mail...

Monday, 20 July 2015

Urban footprints: some building outline data sources

This is an informational post about where to find building outline data, which I've used a lot in previous GIS projects. It might also be of interest to architects, engineers and anyone interested in urban studies and planning more generally. I like using this kind of data to explore cities as it gives us a good idea of the layout of the urban fabric, as in the example below (New Orleans). The links mainly refer to data from the US, Canada and Great Britain but other parts of the world are covered to various extents by OpenStreetMap.

New Orleans

Let's start big, with OSM... Steve Bernard has produced an excellent video which explains how you can get OpenStreetMap data directly into QGIS very simply - he uses Madrid in the example. The accuracy and coverage varies a great deal across the world, so you need to bear this in mind when downloading and using it - but on the whole it is a fantastic resource. The example below shows Mogadishu, where the coverage is incomplete for buildings but pretty good for the road network. 

© OpenStreetMap contributors

Another useful OSM-related resource with decent global city coverage is CAD Mapper, where you can download areas up to 1km square for free. However, I'm focusing on open data today so will not go into detail on this. The best OSM download source is, I think, GEOFABRIK (German for 'geo factory'), a German GIS consultancy who extract and process OSM data and then make it available for free online. It's really nicely structured and easy to find what you're looking for. Here's the download page for New Zealand, for example - followed by the contents, where you can see the building data on top of a current OSM base map. At time of writing, the zipped shp folder for the whole of New Zealand was 146MB.

The New Zealand GEOFABRIK download page (20 July 2015)

Auckland, NZ - very good building coverage here

The OSM sources are great, since the licence is very generous and you can use the data for just about anything, so long as it's properly cited. However, many towns, cities and counties across the world also provide building footprint or outline data (the terminology varies from place to place) so I've put together a list below of ones I know about. Some of them (e.g. Detroit, NYC) cover land parcels or tax lots so are slightly different but in the main it's just building outlines. I've included visuals for some of the datasets, so you can get an idea of what they look like.

New York City - from the BYTES of the BIG APPLE website you can download the MapPLUTO dataset, for all 5 Boroughs in New York City. Tax lot level rather than building outlines, but it's an extremely rich dataset with loads of useful land use planning variables in it, including 'year built' and number of floors. A little sample of the data is shown below, for the area around Central Park.

A little sample of the data (using Qgis2threejs)

Chicago - the building footprints layer is available in two versions online, one of which says it is deprecated but I've heard from the Chicago GIS team that this isn't the case. It's just that due to limited staff the dataset is only edited when necessary. Also contains a 'year built' and height variable.

San Francisco - another really good city buildings dataset, from SF OpenData. Also lots of useful variables in this dataset, including height. I really like this one.

Dallas - you'll probably get a disclaimer box in a pop-up when you go to download this. I've linked to the general GIS page and the file you want is called Structures (Building Footprints) in the Planimetric Data section - it's about 81MB to download and the unzipped file is well over 100MB.

Atlanta - again, I've linked to the GIS page, this time from the City of Atlanta and you need to download the 'Impervious Buildings' layer. If you're looking to map the sprawl of Atlanta, this won't work as it covers the City area only. Still, a very useful dataset.

Denver - excellent open data from Denver. This dataset covers all permanent structures and buildings for a 152 square mile area of the City and County of Denver. Available in a number of different formats.

Seattle - this dataset was created in 2009 by Pictometry International Corp but is now in the public domain. It is available via the City of Seattle's data website.

Los Angeles - this is a fantastic dataset for the County (not just the City) of Los Angeles, which is the most populous county in the United States (just over 10 million). Made available via the LA County GIS Data Portal. It is a little hefty (581MB) so be careful! In the example below I show all the buildings in LA County but the City of Los Angeles in dark shading, just to emphasise its crazy shape.

Boston - this was created in 2012 and is available via the City of Boston. Contains a number of different fields, including base elevation of the structures, the elevation of the highest point above sea level and fields on building type.

Detroit - like New York, not strictly a buildings outline file but instead a property lot level dataset. Very impressive dataset produced by Data Driven Detroit's Motor City Mapping project. I've used this data a lot in talks and teaching as it's a really good example of its type.

Now some links to further datasets which I know of but haven't used that much...

Washington DC - link is to the download page, but direct link to zip is here (559MB unzipped)

Baltimore - the top link on this page

Philadelphia - via OpenDataPhilly

Massachusetts - buildings for a wide range of towns and cities in the state

Boulder - this is from Boulder County, Colorado. Available in a number of different file formats.

Bloomington, Indiana - one of many smaller cities with excellent geodata

New Orleans - an excellent dataset, not just because of the unusual shape of the city!

Toronto - don't be confused by the '3D Massing' terminology here. Scroll down to the 'Data download' section

Vancouver - doesn't cover the whole city and they were digitised in 1999 but still a useful dataset.

Waterloo - this is from the Region of Waterloo and was up to date as of January 2014.

Hobart, Tasmania - a nice example of building data from Hobart in Australia. Contains a 'year constructed' variable.

Wellington, NZ - can't overlook New Zealand! I think you need to register to download this but it's Creative Commons 3 so still open. 

The list wouldn't be complete without mentioning OS OpenData for Great Britain, provided by Ordnance Survey. A new dataset with detailed buildings (the OS Open Map - Local dataset) became available in March. The building data is a very small part of this collection but one I find very interesting. I've patched together a few cities here to get the ball rolling but you can download your own. There's also a 'tile finder' to help you identify which OS tile you need to cover your area of interest.

This could save you some time 

I think this just about covers it. Get in touch if you have any other great data sources for building outlines.

Sunday, 12 July 2015

Mapping the Polycentric Metropolis: journeys to work in the Bay Area

I’ve recently been writing and thinking about polycentric urban regions, partly because I’m interested in how places connect (or not) for one of my research projects, and partly because I’ve been experimenting with ways to map the connections between places in polycentric urban regions. There was quite a lot of the latter in Peter Hall and Kathy Pain’s ‘The Polycentric Metropolis’ from 2006 but given that the technology has moved on a little since then I thought I’d explore the topic in more detail. Mind you, I’ve also been looking back on Volumes 1 to 3 of the Chicago Area Transportation Study of 1959 as a reminder that technology hasn’t moved on as much as we think – their ‘Cartographatron’ was capable of mapping over 10 million commuting flows even then (though it was the size of a small house and required a team of technicians to operate it – see bottom of post for a photo).

Are you part of the big blue blob?

Anyway, to the point… What’s the best way of mapping polycentricity in an urban region? For this, I decided to look at the San Francisco Bay Area since it has been the subject of a few studies by one of my favourite scholars, Prof Robert Cervero of UC Berkeley. Also, a paper by Melanie Rapino and Alison Fields of the US Census Bureau identified the Bay Area as the region with the highest percentage of ‘mega commuting’ in the United States (traveling 90 or more minutes and 50 or more miles to work). Therefore, I decided to look at commuting flows between census tracts in the 9 counties of the Bay Area, from Sonoma County in the north to Santa Clara County in the south. I’ve used a cut-off of 30 miles here instead of the more generous 50 mile cut-off used by Rapino and Fields. I also mapped the whole of the United States in this way, but that’s for another day.
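For anyone wanting to apply similar cut-offs to their own flow data, here is a small Python sketch. It assumes distance is measured as a straight line (great circle) between tract centroids, which may not be how the figures in this post were calculated; the coordinates and function names are illustrative only. The 50 mile / 90 minute thresholds are the Rapino and Fields definition quoted above.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/lon points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def exceeds_cutoff(distance_miles, cutoff_miles=30):
    """True if a flow is longer than the chosen distance cut-off."""
    return distance_miles > cutoff_miles


def is_mega_commute(distance_miles, minutes):
    """Mega commute: 50 or more miles and 90 or more minutes."""
    return distance_miles >= 50 and minutes >= 90


# Approximate city-centre coordinates, used here in place of centroids
san_francisco = (37.7749, -122.4194)
san_jose = (37.3382, -121.8863)

d = haversine_miles(*san_francisco, *san_jose)
print(round(d, 1), exceeds_cutoff(d), is_mega_commute(d, 95))
```

Straight-line distances will always come out shorter than road distances, so a cut-off applied this way is more conservative than it looks.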

The series of maps below illustrate both patterns of commuting in the Bay Area and the different approaches I’ve taken in an attempt to capture the essence of polycentrism in the area. I don’t attempt to capture the misery of some of these commutes, since for that I’d need a different kind of technology. But, I do think the animations in particular capture the polycentric nature of commuter flows. If you’re represented by one of the dots in the images below, thanks a lot for taking part!

Let’s start with a simple representation of commutes of over 30 miles from San Francisco County (which is coterminous with the City of San Francisco). The animated gif is shown below and you can click the links to view the sharper video file (mp4) in your browser (so long as you're on a modern browser). The most noticeable thing here is the big blue blob© making its way down from San Francisco to Palo Alto, Mountain View and Cupertino in Santa Clara County. In total, the blue dots represent just over 15,000 commuters going to 803 different destination census tracts. I’m going to take a wild guess and suggest that some of these commutes are by people who work at Stanford, Google and Apple. But it probably also includes people working at NASA Ames Research Center, Santa Clara University and locations in San Jose. 

View video file in browser - or click image to enlarge gif

These patterns aren’t particularly surprising, since there has been a lot of press coverage about San Francisco’s bus wars and commutes of this kind. However, there is a fairly significant dispersal of San Francisco commuters north and east, even if the numbers don’t match those of the big blue blob. By the way, from San Francisco it's about 33 miles to Palo Alto, 39 miles to Mountain View, 42 to Cupertino and 48 to San Jose. 

The first example above doesn’t reveal anything like the whole story, though. There are actually quite a lot of commuters who travel in the opposite direction, from Santa Clara County to San Francisco. More widely, the commuting patterns in the Bay Area, a metro area of around 7.5 million people, resemble a nexus of mega-commuting. This is what I’ve attempted to show below, for all tract-to-tract connections of 10 people or more, with no distance cut-off. The point is not to attempt to display all individual lines, though you can see some. I’m attempting to convey the general nature of connectivity (with the lines) and the intensity of commuting in some areas (the orange and yellow glowing areas). Even when you look at tract-to-tract connections of 50 or more, the nexus looks similar.

Click image to view larger version

Stronger connections - click image to view larger version

If we zoom in on a particular location, using a kind of ‘spider diagram’ of commuting interactions, we can see the relationships between one commuter destination and its range of origins. In the example below I’ve taken the census tract where the Googleplex is located and looked at all Bay Area commutes which terminate there, regardless of distance. In the language of the seminal Chicago Area Transportation Study I mentioned above, these are ‘desire lines’, since each one represents ‘the shortest line between origin and destination, and expresses the way a person would like to go, if such a way were available’ (CATS, 1959, p. 39) instead of, for example, sitting in traffic on US Route 101 for 90 minutes. According to the data, this example includes just over 23,000 commuters from 585 different locations across the Bay Area. I've also done an animated line version and a point version, just for comparison.
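To make the spider diagram idea a bit more concrete, here is a rough sketch of how the inbound flows for one destination tract could be pulled out, totalled and turned into straight desire lines. It is an illustration under assumptions, not the workflow used for the maps: the tract IDs, field names and coordinates are all hypothetical, and the lines are written as WKT text, a plain-text geometry format that GIS software can generally read.

```python
def desire_lines(flows, dest):
    """Summarise inbound flows to one destination and build WKT desire lines."""
    inbound = [f for f in flows if f["dest"] == dest]
    # One straight line per origin-destination pair (x = longitude, y = latitude)
    lines = [
        "LINESTRING ({} {}, {} {})".format(
            f["o_lon"], f["o_lat"], f["d_lon"], f["d_lat"])
        for f in inbound
    ]
    summary = {
        "commuters": sum(f["commuters"] for f in inbound),
        "origins": len({f["origin"] for f in inbound}),
    }
    return summary, lines


# Hypothetical tract IDs, coordinates and commuter counts
flows = [
    {"origin": "T2", "dest": "T1", "o_lon": -122.4, "o_lat": 37.7,
     "d_lon": -122.1, "d_lat": 37.4, "commuters": 30},
    {"origin": "T3", "dest": "T1", "o_lon": -121.9, "o_lat": 37.3,
     "d_lon": -122.1, "d_lat": 37.4, "commuters": 15},
    {"origin": "T2", "dest": "T4", "o_lon": -122.4, "o_lat": 37.7,
     "d_lon": -122.3, "d_lat": 37.8, "commuters": 50},
]

summary, lines = desire_lines(flows, "T1")
```

The summary is where figures like the total number of commuters and the number of distinct origin locations for a single destination would come from.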

Commuting connections for the Googleplex census tract

Animated spider diagram of flows to Mountain View

Just some Googlers going to work (probably) mp4

Looking further afield now, to different parts of the Bay Area, I also produced animated dot maps of commutes of 30 miles or more for the other three most populous counties – Alameda, Contra Costa and Santa Clara. I think these examples do a good job of demonstrating the polycentric nature of commuting in this area since the points disperse far and wide to multiple centres. Note that I decided to make the dots return to their point of origin – after a slight delay – in order to highlight the fact that commuting is a two-way process. The Alameda County animation represents over 12,000 commuters going to 751 destinations, the Contra Costa animation 25,000 commuters and 1,351 destinations, and the Santa Clara animation nearly 28,000 commuters and 1,561 destinations. The totals for all flows within the Bay Area are about 3.3 million commuters and 110,000 origin-destination links.
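The out-and-back movement of the dots can be sketched as a simple function of the frame number. This is a generic linear interpolation written for illustration, not a description of how the animation frames were actually generated; the function name, the frame counts and the sample coordinates are all assumptions.

```python
def dot_position(origin, dest, frame, travel_frames=30, pause_frames=10):
    """Return the (x, y) position of a dot for a given animation frame.

    The dot travels from origin to dest, pauses at dest, then returns,
    and the whole cycle repeats.
    """
    cycle = 2 * travel_frames + pause_frames
    f = frame % cycle
    if f < travel_frames:
        t = f / travel_frames                    # outbound leg
    elif f < travel_frames + pause_frames:
        t = 1.0                                  # waiting at the destination
    else:
        t = 1.0 - (f - travel_frames - pause_frames) / travel_frames  # return leg
    x = origin[0] + t * (dest[0] - origin[0])
    y = origin[1] + t * (dest[1] - origin[1])
    return (x, y)


# Hypothetical origin and destination in arbitrary map units
home, work = (0.0, 0.0), (10.0, 20.0)
halfway_out = dot_position(home, work, 15)   # midpoint on the way out
at_work = dot_position(home, work, 35)       # paused at the destination
halfway_back = dot_position(home, work, 55)  # midpoint on the way home
```

Calling this for every flow at each frame number gives one set of dot positions per frame, which can then be rendered and stitched together into a gif or video.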

Alameda County commutes of 30+ miles mp4

Contra Costa County commutes of 30+ miles mp4

Santa Clara County commutes of 30+ miles mp4

Finally, I’ve attempted something which is a bit much for one map, but here it is anyway: an animated dot map of all tract-to-tract flows of 30 or more miles in the Bay Area, with dots coloured by the county of origin. Although this gets pretty crazy halfway through, I think the mixing of the colours does actually tell its own story of polycentric urbanism. For this final animation I’ve added a little audio into the video file as well, just for fun.

A still from the final animation - view here

What am I trying to convey with the final animation? Like I said, it's too much for a single map animation but it's kind of a metaphor for the messy chaos of Bay Area commuting (yes, let's go with that). You can make more sense of it if you watch it a few times and use the controls to pause it. It starts well and ends well, but the bits in the middle are pretty ugly, just like the Bay Area commute.

My attempts to understand the functional nature of polycentric urbanism continue, and I attempt to borrow from pioneers like Waldo Tobler and the authors of the Chicago Area Transportation Study. This is just a little map-based experimentation in an attempt to bring the polycentric metropolis to life, for a region plagued by gruesome commutes. It’s little wonder, therefore, that a recent poll suggested Bay Area commuters were in favour of improving public transit. If you're interested in understanding more about the Bay Area's housing and transit problems, I suggest watching this Google Talk from Egon Terplan (54:44).

Notes: the data I used for this are the 2006-2010 5-year ACS tract-to-tract commuting file, published in 2013. Patterns may have changed a little since then, but I suspect they are very similar today, possibly with more congestion. There are severe data warnings associated with individual tract-to-tract flows from the ACS data but at the aggregate level they provide a good overview of local connectivity. I used QGIS to map the flows. I actually mapped the entire United States this way, but that’s going into an academic journal (I hope). I used Michael Minn’s MMQGIS extension in QGIS to produce the animation frames and then I patched them together in GIMP (gifs) and Camtasia (for the mp4s), with IrfanView doing a little bit as well (batch renaming for reversing file order). Not quite a 100% open source workflow but that’s because I just had Camtasia handy. The images are low res and only really good for screen. If you’re looking for higher resolution images, get in touch. It was Ebru Sener who gave me the idea to make the dots go back to their original location. I think this makes more sense for commuting data.

The Cartographatron: Information and images on the 'Cartographatron' used in the Chicago Area Transportation Study (1959) are shown below.

From p.39 of CATS, 1959, Vol 1

From p.98 of CATS, 1959, Vol 1