About this Resource
Go to: [ ESRC ] [ RDI ] [ NCRM ]

Geo-Refer logo GEO-REFER Workshop 1: An Introduction to Geographical Referencing

Sample Dataset: QOF (Quality and Outcomes Framework) Information

The data files provided are real and thus include entirely realistic format and quality challenges! The exercise is to identify some aspects of GP service performance and relate them to different geographical areas such as census output areas. Ideally, these will then be mapped. There is no 'right' answer and you are encouraged to treat these datasets in ways which most closely reflect the types of analysis which you might yourself wish to perform on your own data.

Download Data

Two data files are suggested for this exercise:.

  1. QOF0405_Practices_Lookup.xls - This file can be downloaded from http://content.digital.nhs.uk/catalogue/PUB01946/qof-eng-04-05-gp-code-tab-anx.xls. You can click [here] to download the file.
  2. This file contains a lookup table between all the GPs recorded in the QOF dataset and their address and postcode. NB Practices in this context are groups of doctors, not usually individual doctors.
  3. QOF0405_Practices_DomainSummary_Southern.xls - This file can be downloaded from http://www.ic.nhs.uk/statistics-and-data-collections/supporting-information/audits-and-performance/the-quality-and-outcomes-framework/qof-2004/05/qof-2004/05-detailed-spreadsheets/qof-2004-05-gp-tables by selecting 'Southern England' from 'Select a region' and 'Domain Summary' under 'Overall QOF'. You may also click [here] to download the file.
  4. This file contains summary 2004/5 Quality and Outcomes Framework data at general practice (GP) level for all practices in the South. The QOF data have all been downloaded from http://www.ic.nhs.uk/statistics-and-data-collections/supporting-information/audits-and-performance/the-quality-and-outcomes-framework/qof-2004/05/qof-2004-05-detailed-spreadsheets which also contains all the necessary documentation and background. Essentially this file contains the quality points achieved across a range of 'domains' intended to measure primary health care performance. Each domain such as 'Disease' or 'Organization' is a summary of a set of more detailed scores that can be separately downloaded from the website.

Firstly, it will be necessary to associate each of the GP data with their postcodes. Each of the input files will require some manipulation in order to remove the extraneous header lines and unnecessary columns.

QOF0405_Practices_Lookup.xls should be opened in Excel and reduced to the column headings row together with the columns containing the practice code and postcode. We recommend that you create a new Excel file containing just the data which you need to work with. Examining the postcodes in this file will reveal that they are not recorded in a common format: because a space has always been inserted in the middle of the postcode the overall length varies, hence 'TS1 3BE' is 7 characters long while 'TS18 2AT' is 8 characters long. This will mean that these postcodes would not all match successfully against any preformatted list of postcodes such as those found in a postcode directory. For the purposes of this exercise it is possible to create a new postcode column to the right of the existing data. If cell H5 contains the current postcode, Excel can be used to 'tidy up' these postcodes by using the expression =LEFT(H5,4)&(RIGHT(H5,3)). This expression can be copied all the way down a column in order to create a new column of postcodes which always contain 7 characters, with a space only used to pad out the length of a code where the first part is only 3 characters long. Note that when dealing with a postcode list for the entire UK, a more complex expression would be needed as there are valid examples of postcodes that may not be trapped by this simple fix.

From QOF0405_Practices_DomainSummary_Southern.xls we only wish to work with the data for 'Hampshire and Isle of Wight'. If your research required combining multiple strategic health authority areas, then the data would need to be extracted from each spreadsheet and appended in some way. You will need to extract just the practice codes and outcome scores that you are interested in, remove unnecessary header lines and again create a new Excel file containing just the data of interest. It is recommended that the long column headings be amended to something much shorter because Access will automatically (and unhelpfully) rename them. For example the Percentage Disease points could be renamed 'DisPer' etc.

Linking QOF data with postcodes using Access

From this point on, we can work to link the datasets using Access.

Microsoft Access should be opened and a new empty database created. Both of the edited Excel files (i.e. the two spreadsheets you generated in the last section one containing the practice code and tidied postcode and the other the practice domain summary of Hants and Isle of Wight . In case of any problem, here are the two files Practices_Lookup_edit1.xls [click here to download the file] and HantsIOWsummaries_edit1.xls [Click here to download the file] for you to continue the exercise) should then be read in, using the 'Import external data' wizard. Once successfully imported, a new Query needs to be constructed to link the two datasets together using the Practice Code field that appears in each, so that there is a means of linking the QOF data with the postcodes. Note carefully how many records there are in each table (this is displayed at the bottom of the window when viewing a table).

Open a new query using the Design view and add both tables. When first added to the query, these tables will by default be linked by the Access ID but this is not meaningful as the rows in the tables do not correspond. This link may be deleted by right-clicking on it, and a new link created by dragging to connect the Practice code field in one table with the Practice code field other. Fields from each table can be dragged and dropped into the output grid individually, or dragging and dropping the * symbol will move all fields in the table at once. Include all the fields from the QOF summary table and just the postcode from the lookup table. When closing the Query you will be prompted to save it, which you should do, choosing a meaningful name.

Double clicking on the saved Query will run it, displaying the result of linking the two tabular datasets by their common field, the Practice Code. We can now see the QOF summary data displayed in relation to the postcodes.

Check that the Query has done what you were expecting, including the number of records: there should hopefully be one for each data row! Once confident that the correct result has been achieved, this Query can be modified to create a new table containing the linked data. Right click on the Query to open it again in Design view and from the Query menu choose 'Make-table query'. Save and run the query again and it will create a new data table containing the matched data.

Other avenues to explore

At this point, the new table could be exported for processing by a data linkage tool such as GeoConvert, although some editing would be required in order to ensure that the postcode was the first field in each row. This approach is helpful for one-off matching situations. Alternatively, a postcode directory file could be imported to Access and the data matching undertaken within Access, which is perhaps the preferable approach where the same dataset is to be used repeatedly or variations on the same matching operation may be required. Be aware that all postcode directory files for extensive areas will be very large, especially when held in the .mdb format used by Access.

A new table can be imported from an existing Access database file. If a postcode directory is available in this format, this is preferable to importing a new table which will require identification and checking of all the field names. The PCD field of the National Statistics Postcode Directory contains postcodes formatted to 7 characters' length, corresponding to the postcode format applied to these data in Excel. It is therefore possible to create a new Query using the tools already identified to match the postcoded QOF data to selected fields from the postcode directory. These may be additional area codes, if it is intended to allocate the data to a different area type, or Ordnance Survey grid references if the GP locations are to be mapped or subjected to further spatial analysis. In the standard National Statistics Postcode Directory, the field names for Ordnance Survey Grid References are OSEAST (Easting) and OSNORTH (Northing). OACODE contains the 2001 census output area code. All codes are described in the relevant documentation downloaded with the files from the UKBORDERS service.

Tasks that you might undertake using these datasets:

  1. Map the locations of the GP practices using the grid-references that you obtain from the National Statistics Postcode Directory.
  2. Download (from UKBORDERS) and then map the boundaries of all of the Super Output Areas within the Isle of Wight. Note that the Isle of Wight is a Unitary Authority. Unitary Authority and County datasets are frequently indexed under separate categories in various data access sites, including UKBORDERS, which can cause confusion!
  3. Try counting the number of GPs per SOA. Or you could calculate and map the average QOF percentage per SOA.

Copyright Attribution

Access™ and Excel® are registered trademarks of Microsoft Corporation.