Thursday, 5 September 2013

The Geography of New York City

In June 2013, the city of New York released as open data one of the most detailed, fascinating and user-friendly datasets ever. The Primary Land Use Tax Lot Output (PLUTO) dataset is essentially a record of every parcel of land in the city, what is on it and who owns it - but this is only part of it. See the full PLUTO data dictionary for more on this. Wired said the mapping elite were 'drooling' over it and there have been a few impressive visualisations already, but I was keen to look at the data in more detail, map land use patterns and get to grips with the dataset more generally. So, as an initial experiment, I mapped all 11 land use categories for the whole city in 3D (PLUTO has a field for number of floors, so the maps below are extruded on this basis). Click on an image to enlarge and then flick through the images to compare land uses.

I've also put these images in a PowerPoint file in case anyone finds them useful. In many ways these visualisations tell us what many New Yorkers already know, but the PLUTO data (n.b. I've used the ready-made MapPLUTO shapefile) offers everyone, for the first time, the opportunity to explore this open data and examine the geography of New York City as a whole in much more detail.

Some further information about the dataset: there are 857,879 rows in the complete dataset and the MapPLUTO version has 85 fields, so if you want to work with it you had better have a good computer. When you go to the download page you'll notice that the PLUTO dataset is available as one CSV file while the MapPLUTO data is split into the five boroughs of New York City.
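If your machine struggles with a file this size, one option is to read the CSV one row at a time rather than loading it all into memory. Here's a minimal sketch of that idea using only Python's standard library. The function name `summarise_csv` and the tiny sample are my own inventions for illustration, not anything that ships with PLUTO.

```python
import csv
import io


def summarise_csv(handle):
    """Return (row_count, field_count) for an open CSV file.

    Rows are read one at a time, so memory use stays small
    even when the file has hundreds of thousands of rows.
    """
    reader = csv.DictReader(handle)
    row_count = sum(1 for _ in reader)
    # fieldnames is None for a completely empty file
    field_count = len(reader.fieldnames or [])
    return row_count, field_count


# Tiny made-up sample standing in for the real download
sample = io.StringIO(
    "Borough,LandUse,NumFloors\n"
    "MN,05,12\n"
    "BK,01,2\n"
    "QN,,0\n"
)
print(summarise_csv(sample))
```

For the real thing you would pass an open file instead of the sample, something like `open("pluto.csv", newline="")`, where the filename is just a placeholder for wherever you saved the download.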

This is an amazing resource but it is not perfect - as the Department of City Planning recognise when they say 'PLUTO is being provided ... for informational purposes only'. The data are only as good as the sources, and sometimes when you look closely things seem a little strange. For example, here's what you get when you chart the YearBuilt column for all buildings constructed since 1800 (click to enlarge). It's hard to tell, but I reckon that from about 1980 onwards the YearBuilt column is pretty accurate and that before then it is something of a best estimate - though I'd be happy to be proven wrong on this!
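For anyone who wants to reproduce the tally behind a chart like this, here's a rough sketch in Python. It counts lots per YearBuilt value from 1800 onwards and, as a crude check on the 'best estimate' hunch, works out what share of those years end in 0 or 5. If the years were spread evenly you'd expect that to be about one in five, so a much higher share would hint at rounding. That round-year check is my own assumption about how estimated dates might show up, not something the data dictionary says, and the function names and sample rows are made up for illustration.

```python
from collections import Counter


def year_built_counts(rows, first_year=1800):
    """Count rows per YearBuilt value, ignoring years before first_year."""
    counts = Counter()
    for row in rows:
        try:
            year = int(float(row.get("YearBuilt") or 0))
        except ValueError:
            continue  # skip values that are not numbers
        if year >= first_year:
            counts[year] += 1
    return counts


def round_year_share(counts):
    """Share of counted rows whose year ends in 0 or 5 (assumed rounding signal)."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    rounded = sum(n for year, n in counts.items() if year % 5 == 0)
    return rounded / total


# Made-up rows, not real PLUTO records
rows = [
    {"YearBuilt": "1900"},
    {"YearBuilt": "1900"},
    {"YearBuilt": "1987"},
    {"YearBuilt": "0"},     # zero is dropped by the first_year filter
    {"YearBuilt": "1750"},  # before 1800, so dropped
]
counts = year_built_counts(rows)
print(sorted(counts.items()))
print(round_year_share(counts))
```

Running the two functions separately for years before and after 1980 would be one way to test the hunch above.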

I'll probably come back and explore this again soon but that's all for now...

Footnote: 0.4% of tax lots and 1.0% of land remain unclassified. I produced the 3D maps in ArcScene and then annotated them in GIMP. I've just done these to explore, at a basic level, the characteristics of the dataset and the geography of land use in New York City.
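
If you want to check figures like these for yourself, here's a small Python sketch of the calculation. It assumes that 'unclassified' means a blank land use value, that land is measured by lot area, and that the relevant columns are called `LandUse` and `LotArea`. Those are my assumptions for the example, so check them against the data dictionary before relying on the result. The sample rows are made up.

```python
def unclassified_shares(rows):
    """Return (share_of_lots, share_of_area) where LandUse is blank."""
    lots = blank_lots = 0
    area = blank_area = 0.0
    for row in rows:
        try:
            lot_area = float(row.get("LotArea") or 0)
        except ValueError:
            lot_area = 0.0  # treat unreadable areas as zero
        is_blank = not str(row.get("LandUse") or "").strip()
        lots += 1
        area += lot_area
        if is_blank:
            blank_lots += 1
            blank_area += lot_area
    lot_share = blank_lots / lots if lots else 0.0
    area_share = blank_area / area if area else 0.0
    return lot_share, area_share


# Made-up rows, not real PLUTO records
rows = [
    {"LandUse": "01", "LotArea": "300"},
    {"LandUse": "05", "LotArea": "300"},
    {"LandUse": "",   "LotArea": "100"},
    {"LandUse": "11", "LotArea": "300"},
]
lot_share, area_share = unclassified_shares(rows)
print(f"{lot_share:.1%} of lots, {area_share:.1%} of area")
```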