About this Resource
    Resource Category:
  • Concepts
  • Methods
  • Datasets
  • Examples

Geo-Refer logo Geographical Referencing Learning Resources

Using Gazetteer Files to Rezone Data in SASPAC

Researchers sometimes need to deal with datasets that are geo-referenced to different geographies. The process normally involves grouping (e.g. aggregating) or dividing (e.g. allocating) data from one level to another. The Office for National Statistics (ONS) maintains a webpage for users to download look-up tables that provide information on Super Output Area (SOA) constitution and the associated conversions among contemporary administrative and census geographies covering England and Wales. Equivalent tables are also available for Scotland from the General Register Office Scotland and for Northern Ireland from the Northern Ireland Statistics and Research Agency. Academic users can also perform this task using the GeoConvert tool, which is ideal for small to medium sized datasets post-2006.

Example: Download SOA Look-ups from the Neighbourhood Statistics (NeSS) Site

The SOA constitution and look-up tables for England and Wales are available for download from the NeSS site. At the time of writing, this SOA resources page can be found by clicking the "About Neighbourhood Statistics" section link at the lower left corner on the NeSS homepage. The look-ups are located under Geography > Super Output Areas > SOA Lookup Files.

Download SOA Look-up files from NeSS

A range of area coding look-ups are provided. To demonstrate how data can be rezoned using gazetteer files, we rezone census data from output area to lower layer super output area (LSOA). Note that there is another layer of SOAs known as middle layer super output area (MSOA). We shall therefore need to download either Table 1) or 2) that contains the Output Area - LSOA look-up. Choose the ".txt" version, save the zip archive to a local folder and then unzip its content. There should now be a single comma-delimited text file containing the look-up data.

Creating a Gazetteer File for SASPAC using Microsoft Access

Two columns of area code are required in a valid SASPAC gazetteer file. The first column should contain codes which match the zoneids in an existing SASPAC system file (containing the data to be aggregated) while the second one is for the new zone to which data are to be aggregated. The file must not have field labels and should be saved in a "fixed-width" format. In order to make the downloaded TXT table to be ready for SASPAC, we can use Microsoft Access to reformat the data. At the same time, we can also reduce the number of records to only the areas that are required.

Start Microsoft Access. Select File > New and click the "Blank database..." icon on the New File pane. Name the blank database as "oa_lsoa_lookup.mdb" and save the file to a local folder.

Select File > Get External Data > Import... and change the "Files of type:" option to "Text Files (*.txt; *.csv; *.tab; *.asc)" to import the look-up file previously downloaded from NeSS. Ensure that comma is chosen as the field delimiter type and the first row contains field names before creating the table in Microsoft Access. Select Insert > Query and use the "Design View " option to rearrange the data. Add the table to the query and then drag and drop "OA_code", "LSOA_code" and "LA_name" from the table to the output grid. In this example, input "Newham" to the Criteria: row on the LA_name column to select only the records within the London Borough of Newham. You can substitute this with any valid local authority name. Uncheck the Show: box for this filtering column so that the results will only contain "OA_code" (the existing zone) and "LSOA_code" (the new zone).

Creating gazetter files based on NeSS look-ups in Access

Exit and save the query as "newham_oa_lsoa_lookup". Double-clicking on the query title in the Queries list should display a two-column table. The table can now be exported as a text file, select File > Export... and change the Save as type: to "Text files (*.txt; *.csv; *tab; *asc)". If required, change the saving location to be the same folder as the original TXT file. Accept the default filename ("newham_oa_lsoa_lookup") and export the data. In the Export Text Wizard dialog box, change the export format to be "Fixed Width". Don't worry if the second column of data is not visible at this stage which is due to the default use of very wide fields.

Export fixed width text file in Access

Click the "Advanced..." button to correct the field width. A new dialog box will appear to allow users to specify the start position and width of each data item. The width of the "OA_code" should be 10 but to give a white space between the data, set it to be "11". Also change the start position and width of LSOA_code to be "12" and "10" respectively. Again, the width of LSOA_code should be "9" but the extra space is added to separate this column from the optional scale factor values that can be added should users wish. Please refer to the SASPAC help for further details on the use of scale factors.

Specify the start position and width of the data items in Access for text export

Click OK to save the changes and then Next> in the Export Text Wizard dialog box to check that the data items are correctly separated.

Check the breakline between fields in a fixed width text file in Access

Click Finish to export the data. The gazetteer named as "newham_oa_lsoa_lookup.txt" should now be created in the local folder.

Rezoning Census Data using SASPAC

The text file exported can now be used by SASPAC to rezone data from output area to the lower layer of super output area. To demonstrate this, we shall aggregate the number of pre-school children in Newham from the OA to the LSOA level. A subset of the relevant Census Key Statistics (KS) table covering Newham formatted as a SASPAC system file can be downloaded by clicking [here]. The name of the system file is "NEWHAM_KEYSTATS_OA.SYS".

Save the file to a local folder. Start SASPAC. Close the "Quickstart" task menu if it is launched. Select File > New Task > Create New Zone to aggregate the variables in the KS table from OA to LSOA. The Create New Zone dialog box will appear. First Click the "Other Files..." button to open the abridged KS data for Newham. After loading the system file, click the "Gazetteer Files" button in the Method section to specify the input text file. Use the "Input from..." button to locate and open the gazetteer file. Remember to change the Files of type: to "All Files (*.*)". Specify the start and finish position of the existing and new zone. For the existing zone, the start and finish columns should be "1" and "10" whereas those for the new zone are "12" and "20". Each OA zone in this example is nested wholly within one LSOA zone so there is no need to use the scale factor column specifications. Leave all the other options unchecked and click Close to complete the input gazetteer's process. Again, please refer to SASPAC Help if you would like to explore the other available options for this command.

Specify the columns of data for a gazetteer file in SASPAC

Click the "Output to..." button and name the output file "NEWHAM_KEYSTATS_LSOA.SYS". Accept the default folder location (that is likely to be "C:\SASPAC\SYSFILES\"). Choose a simple textual label (no space or special characters) for the System File Label field.

Outputting the rezoned data using a gazetteer file in SASPAC

Click OK and the Task dialog box will appear. Accept the default settings and click OK again to execute the task. A new system file listing all the Key Statistics variables against the LSOA code in Newham should now be created and can be used directly in SASPAC.

Analysing Rezoned Data

In this example, we are going to determine the sum of pre-school children in each of LSOA. To do that, close the current log file window and select File > New Task > Output Report > Print Variables... Click the "Other Files..." button and select the newly created "NEWHAM_KEYSTATS_LSOA.SYS" system file and input "KS0020002" into the Name(s) box within the Variables section. KS0020002 is the variable containing counts of children aged between 0 and 4. Check the boxes for both "Print with Statistics" and "Print with Labels" options.

Print the aggregated pre-school children figures in SASPAC

Click OK. Accept all the default settings shown in the Task dialog box and click OK to continue. After the commands have been run successfully, a report PRN file should be created. To examine the results, close the current log file window and select File > Open > Open Print File... and open the output PRN file (most likely to be called "SASPAC.PRN" located in "C:\SASPAC\REPORT\" unless you have changed the default settings in the Task dialog box). The counts of pre-school children are now aggregated to the LSOA level as shown below.

The output

At the bottom of the file you should also find the summary statistics showing the standard deviation, maximum and minimum counts, plus the counts of missing values if there are any mismatches between the original and aggregated zones. This a plain text file so it can edited by a text editor, e.g. Microsoft WordPad, for further use.

Copyright Attribution

The Small Area Statistics Package (SASPAC) software is owned by the Improvement and Development Agency for Local Government (IDeA). AccessT and WordPad are registered trademarks of Microsoft Corporation. The software screen shots shown in this page are reprinted with permission from IDeA and Microsoft Corporation.