using the libraries taskforce dataset to create visualisations
Contains OS data © Crown copyright and database right (2017). Contains Royal Mail data © Royal Mail copyright and Database right 2017. Contains National Statistics data © Crown copyright and database right 2017. Library data released by the Libraries Taskforce under the Open Government Licence.
On 30th March 2017 the Libraries Taskforce released Public Libraries in England: Basic Dataset. This listed all public libraries in England with the following fields:
The data was collected from library authorities in mid-2016. It includes all types of public libraries: real ones, volunteer, neighbourhood collections, etc.
One limiting factor was location data. For each property in the UK, Ordnance Survey (OS) hold an address, co-ordinates, and a Unique Property Reference Number (UPRN). This data is not open data, though aspects of it can be made open (such as UPRN) within derived datasets. It is available as part of a premium product called AddressBase, which is free for public sector organisations, but costly for others.
The OS have a presumption to publish policy for open data. For public asset datasets, that do not 'substantially copy' their licensed data, it can be presumed these can be open data. In the case of around 3000 libraries, it is unlikely that releasing the locations would affect the commercial viability of AddressBase, or substantially copy it. It would be great if any future library dataset could publish the 'official' co-ordinates, property references, and addresses of library locations, as well as obtaining an exemption from the OS to allow this.
This tutorial will map those libraries using the current dataset and freely available open data. This can be surprisingly difficult. Those familiar with online mapping tools (like Google), will know you normally just upload the address spreadsheet, it magically geocodes it all and shows the map. That's true (ish), but there are normally licence restrictions. Google will geocode addresses for you, but you MUST plot the points on a Google map and use their services, and you cannot then make the co-ordinates open data.
Carto (previously known as CartoDB) is an online map mapping service. The free tier of membership restricts how much data you can store and doesn't allow private data.
It works by uploading data into it's web interface. As long as this data has some kind of geography component (such as co-ordinates) you can view maps, and optionally embed those maps into websites.
Carto has automatically added columns for 'ID' and 'the_geom'. 'the_geom' is the column that will store the location (geometry) of each library. There is no recognised location data in the uploaded file, so Carto will currently show these as 'null'.
It also looks like the column names aren't correct. That's because the original data looks like this:
|Static Public Libraries (open at 1 July 2016)||Location|
|Library service||Library name||Postal address||Postcode||Contact email||URL of library website|
|Bolton||Highfield neighbourhood collection (The Orchards)||The box under the table, The corner, The Foyer, Highfield (The Orchards), Highfield Road, Bolton||BL4 0RNfirstname.lastname@example.org||a URL|
It seems to have two header rows, the second one being the 'real' column headers, and the first one only having two entries. Carto uses the first one for it's column headers and fills in missing headers with default ones. Correcting this would mean downloading the file from gov.uk, deleting the first header row in a text editor (like notepad) and then uploading it as a dataset manually into Carto. Or just leaving it and not worrying too much. Will leave it as is for now, it'll just mean not referring to meaningful column names later on.
Code Point Open is an Ordnance Survey open data product. It provides a list of all current valid postcodes in Great Britain with their co-ordinates. Because a postcode can contain many properties these co-ordinates represent the centre of the postcode. That causes some issues, a bit like when you put a postcode into a SatNav but end up a mile away from your destination, maybe with a river and motorway in the way. So it's not completely accurate. As libraries are pretty important buildings, they often have a postcode all to themselves. For those ones the location data will be very precise, right on top of the building.
Code Point Open data comes in a zip file with a file for each postcode area (e.g. BA.csv). That's a little tedious. We just want to upload everything in one go, so need to combine the files into one. The data is also a little large, and on the free version of Carto we're limited to 250Mb. That 250Mb also doesn't necessarily relate to the size of the CSV you want to upload. It's about how much space the data occupies within a database structure. What we'd need to do is delete columns from the CSV that we don't need. As we're also dealing just with England we could filter the postcodes to only include those from England.
To save some time, there's a CSV file on the libraries hacked GitHub account that's already been processed, so you don't need to download code point open or mess around with it.
Because it's so large it's worth indexing the postcodes dataset, which should improve performance.
CREATE INDEX postcodesindex ON postcodes_england (postcode)
This tells the underlying database to create an index (called 'postcodesindex') on the 'postcodes_england' table, using the column called 'postcode'.
So with both our library dataset and official postcodes uploaded, some data checking can be done. Yay!
SELECT * FROM libraries
This tells us that the current data view is showing all columns (*) from the libraries dataset. We can change this to a view that only shows the libraries where the library postcode is NOT in the postcodes dataset (it's probably wrong).
select field_2, field_4 from libraries where replace(field_4, ' ', '') not in (select replace(postcode, ' ', '') from postcodes_england) order by field_2
The replace() function around each postcode just removes the spaces when checking for matches, to allow for situations where the spacing may be different (e.g. 'BA11RG' and 'BA1 1RG'). 'Field_4' is the field that would have been called 'postcode' in the Taskforce dataset if it didn't have the extra header row. 'Field_2' is the library name, which we also use for alphabetical ordering in the query above.
There are 49 rows returned. These are shown below, with the reasons why they're wrong. (Carto doesn't work this out, I had to look em all up on google!)
|Library name||Dataset postcode||Actual postcode||Reason|
|Adswood and Bridgehall||SK8 8NR||SK3 8NR||Typo|
|Allington Library||ME6 0PR||ME16 0PR||Typo|
|Alresford||S024 9AQ||SO24 9AQ||Typo|
|Appleby||CA16 1QP||CA16 6QP||Typo|
|Arcadia library and leisure centre||M19 3PH||M19 3AF||Postcode no longer in use|
|Bethnal Green library||E2 0HL||E2 0HW||Postcode no longer in use|
|Bingley||BD16 1AW||BD16 1AJ||Postcode no longer in use|
|Boscombe||BH1 1BY||BH5 1BY||Typo|
|Brampton||CA8 8NX||CA8 1NW||Typo|
|Burton Book Drop||LA6 7NA||LA6 1NA||Typo|
|Chelmsford||CM1 1LH||CM1 1QH||Typo|
|Consett||DH8 5AT||DH8 5AP||Postcode no longer in use|
|Corby||NN17 1QJ||NN17 1QG||Typo|
|Crowborough Library||TN6 1DH||TN6 1FE||Postcode no longer in use|
|Daventry||NN1 4GH||NN11 4GH||Typo|
|Diss Library||IP22 3DD||IP22 4DD||Typo|
|Failsworth Library||M35 0FJ||M35 0AE||Postcode no longer in use|
|Farley Community Centre||LU2 5RE||LU1 5RE||Typo|
|Forest Hall||NE12 0LJ||NE12 7JY||Postcode no longer in use|
|Frinton||C013 9DA||CO13 9DA||Typo|
|Garforth library and one stop centre||LS25 1DU||LS25 1EH||Typo|
|Highley||WV16 6GH||WV16 6JG||Typo|
|Holderness Road Customer Service Centre Library||HU9 2AH||HU9 2BN||Postcode no longer in use|
|Keighley||BD21 3SX||BD21 2AT||Typo|
|Littlemore||OX4 5JY||OX4 6JZ||Typo|
|Long Ashton||BS4 9AH||BS41 9AH||Typo|
|Lymm||WA13 5SL||WA13 0QW||Typo|
|Mansfield Woodhouse||NG18 8AH||NG19 8AH||Typo|
|Newark||NG24 1U||NG24 1UW||Typo|
|Newbiggin Hall Library||NE5 4BZ||NE5 4BR||Typo|
|Newington Library||CT12 6NB||CT12 6FA||Postcode no longer in use|
|Newton Aycliffe Library||DL5 5QG||DL5 4EJ||Postcode no longer in use|
|Offerton||SK2 5NX||SK2 5NW||Postcode no longer in use|
|Old Marston||OX3 3PH||OX3 0PH||Typo|
|Park Library||SN3 2LP||SN3 2LZ||Typo|
|Paulton||BS29 7QG||BS39 7QG||Typo|
|Preston||HA9 8PL||HA9 8LX||Postcode no longer in use|
|Primrose library||NE32 34AU||NE32 4AU||Typo|
|Redbourn||AL3 3JQ||AL3 7BP||Changed|
|Shard End Library||B34 7AG||B34 7AZ||Postcode no longer in use|
|Shiremoor||NE27 OHJ||NE28 9JQ||Typo|
|Skelton Library||TS12 2HN||TS12 2HP||Typo|
|Thornaby Central Library and Customer Service Centre||TS17 9EU||TS17 9EW||Typo|
|Thrapston||NN14 4SJ||NN14 4JJ||Typo|
|Wakefield One||WF1 2DA||WF1 2EB||Typo|
|Wargrave||EG10 8EP||RG10 8EP||Typo|
|Winslow||MK18 3RB||MK18 3DL||Typo|
|Worksop||S89 2BP||S80 2BP||Typo|
There will be other errors in the postcodes but these are just the ones that have resulted in an invalid postcode.
We have a load of corrections we can make to the dataset.
update libraries set field_2 = trim(from field_2); update libraries set field_4 = trim(from field_4); delete from libraries where field_2 = 'Library name'; update libraries set field_4 = 'SK3 8NR' where field_2 = 'Adswood and Bridgehall' and field_4 = 'SK8 8NR'; update libraries set field_4 = 'ME160PR' where field_2 = 'Allington Library' and field_4 = 'ME6 0PR'; update libraries set field_4 = 'SO24 9AQ' where field_2 = 'Alresford' and field_4 = 'S024 9AQ'; update libraries set field_4 = 'CA16 6QP' where field_2 = 'Appleby Library' and field_4 = 'CA16 1QP'; update libraries set field_4 = 'M19 3AF' where field_2 = 'Arcadia library and leisure centre' and field_4 = 'M19 3PH'; update libraries set field_4 = 'E2 0HW' where field_2 = 'Bethnal Green library' and field_4 = 'E2 0HL'; update libraries set field_4 = 'BD16 1AJ' where field_2 = 'Bingley' and field_4 = 'BD16 1AW'; update libraries set field_4 = 'BH5 1BY' where field_2 = 'Boscombe Library' and field_4 = 'BH1 1BY'; update libraries set field_4 = 'CA8 1NW' where field_2 = 'Brampton Library' and field_4 = 'CA8 8NX'; update libraries set field_4 = 'LA6 1NA' where field_2 = 'Burton Book Drop' and field_4 = 'LA6 7NA'; update libraries set field_4 = 'CM1 1QH' where field_2 = 'Chelmsford library' and field_4 = 'CM1 1LH'; update libraries set field_4 = 'DH8 5AP' where field_2 = 'Consett Library' and field_4 = 'DH8 5AT'; update libraries set field_4 = 'NN17 1QG' where field_2 = 'Corby' and field_4 = 'NN17 1QJ'; update libraries set field_4 = 'TN6 1FE' where field_2 = 'Crowborough Library' and field_4 = 'TN6 1DH'; update libraries set field_4 = 'NN11 4GH' where field_2 = 'Daventry' and field_4 = 'NN1 4GH'; update libraries set field_4 = 'IP22 4DD' where field_2 = 'Diss Library' and field_4 = 'IP22 3DD'; update libraries set field_4 = 'M35 0AE' where field_2 = 'Failsworth Library' and field_4 = 'M35 0FJ'; update libraries set field_4 = 'LU1 5RE' where field_2 = 'Farley Community Centre' and field_4 = 'LU2 5RE'; update libraries set field_4 = 'NE12 7JY' where field_2 = 'Forest Hall' and field_4 = 'NE12 0LJ'; update libraries set field_4 = 'CO13 9DA' where field_2 = 'Frinton library' and field_4 = 'C013 9DA'; update libraries set field_4 = 'LS25 1EH' where field_2 = 'Garforth library and one stop centre' and field_4 = 'LS25 1DU'; update libraries set field_4 = 'WV16 6JG' where field_2 = 'Highley' and field_4 = 'WV16 6GH'; update libraries set field_4 = 'HU9 2BN' where field_2 = 'Holderness Road Customer Service Centre Library' and field_4 = 'HU9 2AH'; update libraries set field_4 = 'BD21 2AT' where field_2 = 'Keighley' and field_4 = 'BD21 3SX'; update libraries set field_4 = 'OX4 6JZ' where field_2 = 'Littlemore' and field_4 = 'OX4 5JY'; update libraries set field_4 = 'BS41 9AH' where field_2 = 'Long Ashton' and field_4 = 'BS4 9AH'; update libraries set field_4 = 'WA13 0QW' where field_2 = 'Lymm' and field_4 = 'WA13 5SL'; update libraries set field_4 = 'NG19 8AH' where field_2 = 'Mansfield Woodhouse' and field_4 = 'NG18 8AH'; update libraries set field_4 = 'NG24 1UW' where field_2 = 'Newark' and field_4 = 'NG24 1U'; update libraries set field_4 = 'NE5 4BR' where field_2 = 'Newbiggin Hall Library' and field_4 = 'NE5 4BZ'; update libraries set field_4 = 'CT12 6FA' where field_2 = 'Newington Library' and field_4 = 'CT12 6NB'; update libraries set field_4 = 'DL5 4EJ' where field_2 = 'Newton Aycliffe Library' and field_4 = 'DL5 5QG'; update libraries set field_4 = 'SK2 5NW' where field_2 = 'Offerton' and field_4 = 'SK2 5NX'; update libraries set field_4 = 'OX3 0PH' where field_2 = 'Old Marston' and field_4 = 'OX3 3PH'; update libraries set field_4 = 'SN3 2LZ' where field_2 = 'Park Library' and field_4 = 'SN3 2LP'; update libraries set field_4 = 'BS39 7QG' where field_2 = 'Paulton (The Hub)' and field_4 = 'BS29 7QG'; update libraries set field_4 = 'HA9 8LX' where field_2 = 'Preston' and field_4 = 'HA9 8PL'; update libraries set field_4 = 'NE32 4AU' where field_2 = 'Primrose library' and field_4 = 'NE32 34AU'; update libraries set field_4 = 'AL3 7BP' where field_2 = 'Redbourn' and field_4 = 'AL3 3JQ'; update libraries set field_4 = 'B34 7AZ' where field_2 = 'Shard End Library' and field_4 = 'B34 7AG'; update libraries set field_4 = 'NE28 9JQ' where field_2 = 'Shiremoor' and field_4 = 'NE27 OHJ'; update libraries set field_4 = 'TS12 2HP' where field_2 = 'Skelton Library' and field_4 = 'TS12 2HN'; update libraries set field_4 = 'TS17 9EW' where field_2 = 'Thornaby Central Library and Customer Service Centre' and field_4 = 'TS17 9EU'; update libraries set field_4 = 'NN14 4JJ' where field_2 = 'Thrapston' and field_4 = 'NN14 4SJ'; update libraries set field_4 = 'WF1 2EB' where field_2 = 'Wakefield One' and field_4 = 'WF1 2DD'; update libraries set field_4 = 'RG10 8EP' where field_2 = 'Wargrave' and field_4 = 'EG10 8EP'; update libraries set field_4 = 'MK18 3DL' where field_2 = 'Winslow' and field_4 = 'MK18 3RB'; update libraries set field_4 = 'S80 2BP' where field_2 = 'Worksop' and field_4 = 'S89 2BP'; update libraries set field_4 = 'TA24 8NP' where field_2 = 'Porlock' and field_4 = 'TA24 7HD';
Whew! Quite a few updates.
We can then create a new dataset with the location data from the postcodes table, and the library data from the libraries table.
select l.static_public_libraries_open_at_1_july_2016, l.field_2, l.location, l.field_4, l.field_5, l.field_6, p.the_geom from libraries l join postcodes_england p on replace(p.postcode, ' ', '') = replace(l.field_4, ' ', '')
This goes away and creates a new dataset, which will be called 'libraries_copy'. Rename this (Rename dataset on the same three dot menu) to something more descriptive like 'librariesgeo'.
This creates a new map and takes you into the map editor tool. From here on there is lots you can do to customise the map. When you're done the map can be published with the Share button. An example is embedded at the top of this page.