About this Resource
    Resource Category:
  • Concepts
  • Methods
  • Datasets
  • Examples

Geo-Refer logo Geographical Referencing Learning Resources

Example: Linking and Aggregating Postcoded School Data to Output Areas using Microsoft Access

The postcode is one of the most commonly-used geographical references in both primary and secondary data sources. It is quite common for users to have to match postcoded data to a higher level of geography e.g. to a set of current health authorities or areas within a census geography. This operation can be performed using Microsoft Access, where it is a specific instance of the more general task of matching two lists together. Academic users can also perform this task using the GeoConvert tool, which is ideal for small to medium sized datasets post-2006.

This example will demonstrate how a set of postcoded records can be aggregated to the output area (OA) level via a customised postcode to OA look up. The two source data files " _bypcode.csv" and "lookup_oa_pcode_xy.csv" should be among the unzipped content of the zip archive that you have downloaded at the beginning of the workshop. If you did not attend the workshop, you can click [here] to download the zip archive of the working data. One file of the archive contains the postcoded records and the other is an example of a postcode to OA code lookup table covering the same area.

Please note that an extensive postcode lookup table covering the whole UK known as the National Statistics Postcode Directory (NSPD). It is available for purchase from the Office for National Statistics (ONS). The directory is available freely to academic users through the ESRC Census Programme's UKBORDERS service.

Linking Postcoded Records with OA Codes using Microsoft Access

Start Microsoft Access. Select File > New and click the "Blank database..." icon on the New File pane. Name the blank database as " _byoacode.mdb".

Both data files are in CSV file format so they should be imported to Microsoft Access as external data using the "Text Files (*.txt; *.csv; *.tab; *.asc)" file type option. Please note that both files contain variable labels so the "First Row Contains Field Names" box must be checked. In this particular example, it is important to check that the "Postcode" fields are present in the two newly created tables and the values of the field are in the same format in each. Once successfully imported, a new Query needs to be constructed to link the two datasets together using the "Postcode" field that appears in each, so that there is a means of linking the postcoded data with the output area codes. Note carefully how many records there are in each table (this is displayed at the bottom of the window when viewing a table).

Open a new query using the Design view and add both tables. When first added to the query, these tables will by default be linked by the ID field (created automatically by Microsoft Access) but this is not meaningful as the rows in the tables do not correspond. This link may be deleted by right-clicking on it, and a new link created by dragging to connect the "Postcode" field in one table with the same field in the other. Fields from each table can be dragged and dropped into the output grid individually, or dragging and dropping the * symbol will move all fields in the table at once. Include the "Postcode", "Output Area", "SCH_NAME" and "APScore" fields. "APScore" is the average point score based on the Key Stage 2 results achieved by the primary schools. When closing the Query you will be prompted to save it, which you should do. Name the query as " _byoacode".

Double-clicking on the saved Query will run it, displaying the result of linking the two tabular datasets by their common field, the "Postcode". Each school record now contains not only its postcode and average point score, but also the code of the OA which it is fully within.

Check that the Query has done what you were expecting, including the number of records: there should hopefully be one for each data row! All records from the original data file should now be geo-referenced simultaneously by postcodes and by output area codes. Once confident that the correct result has been achieved, the output from this query can be modified to create a new table containing the linked data. Right click on the Query to open it again in Design view and from the Query menu choose "Make-Table Query...". Name the table as "linking_pcd_oa". Run the query again and it will create a new data table containing the matched data. You can now quit the query without saving.

Aggregating Data in Microsoft Access

Once an output area code has been assigned to each record, it is possible to aggregate the data to the output area level. Select Insert > Query and then choose the "Design View" option. In the Show Table dialog box, add the "linking_pcd_oa" table. Close the Show Table dialog box and drag and drop "Output Area" and "APScore" into the "Field:" row of the grid. Right-click anywhere of the grid and turn on the "Totals" option. A new "Total:" row will then be added to the grid. On this new row, click the "Group By" cell of the "APScore" column and select the "Avg" option. This option will aggregate all the values of the "APScore" within the same "Output Area" before determining the average "APScore" of that area. Close the Query window and save the results as " _aggoacode".

Regrouping data using OA code in Access

To examine the aggregated data, double-click on the " _aggoacode" query. The second column will now be renamed as "AvgOfAPScore" and the number of records will be reduced from 61 to 60 rows of data (as there are two schools lying in "00BBGR0028") , arranged by the unique area codes. Note that selecting the query and choosing "Design" opens the design view again and allows it to be modified, for example if it has not worked as intended or you wish to re-order the fields in the result.

The aggregated crimes records in Access

The query results can now be exported to an external file for further analysis using File > Export, for example in Microsoft Excel spreadsheet (.XLS), database (.DBF) or comma-delimited text file (.CSV) formats. These options would also allow the aggregated data to be transferred to a mapping package. In this example, export the query results in the CSV format and name the file as " _byoa.csv"

Copyright Attribution

Access™ and Excel® are registered trademarks of Microsoft Corporation. The software screen shots shown in this page are reprinted with permission from Microsoft Corporation.