Geographical Referencing Learning Resources

Correctly Formatting Postcodes

Most computer applications which use postcodes will require the codes to be formatted in a particular way in order to operate correctly. Unfortunately there are multiple ways of writing postcodes in common usage, including several common errors, which are readily interpretable to the human eye but which will cause automated matching of postcodes to fail.

"SO17 1BJ" is a full, correct, unit postcode. It contains eight characters altogether, one of which is a space separating the outcode ("SO17") and incode ("1BJ") parts.

"SO171BJ" is a variant of this code which is technically correct but which contains only seven characters, with no space. Both seven and eight character versions are widely used.

The following are all variants of the same postcode which are close enough to the correct version that they would probably be correctly delivered by the postal service. However, none of them is technically correct and none would automatically match the correct version using general purpose office or statistical software. They illustrate problems of incorrect spacing, inconsistent letter case ("J" and "j") and substitution of similar characters ("I" and "1", "O" and "0"). These types of error occur most frequently when the original data were entered by hand and have not been subsequently cleaned or validated.

  • "S0171BJ"
  • "SOI7 1BJ"
  • "SO17 IBJ"
  • "S0I7 IBJ"
  • "SO I7 Ibj"
  • "So17IBJ"
  • "S017 1BJ"
  • "SO1 71BJ"
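
As a rough illustration of why these variants fail to match, the following Python sketch compares each one against the correct code, first directly and then after converting to upper case and removing all spaces. The names REFERENCE, VARIANTS and squash are illustrative only and do not belong to any particular software package.

```python
REFERENCE = "SO17 1BJ"

VARIANTS = [
    "S0171BJ", "SOI7 1BJ", "SO17 IBJ", "S0I7 IBJ",
    "SO I7 Ibj", "So17IBJ", "S017 1BJ", "SO1 71BJ",
]


def squash(code: str) -> str:
    """Upper-case the code and remove all whitespace."""
    return "".join(code.upper().split())


# Direct string comparison, as general purpose software would perform it.
exact = [v for v in VARIANTS if v == REFERENCE]

# Comparison after normalising letter case and spacing only.
squashed = [v for v in VARIANTS if squash(v) == squash(REFERENCE)]

print(exact)     # []
print(squashed)  # ['SO1 71BJ']
```

Only the variant whose sole fault is a misplaced space matches after this simple normalisation. The remaining seven contain character substitutions which changes to case and spacing cannot repair.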

For those with large quantities of data to process, there is a range of commercial services on offer whereby computer-readable postcode lists can be cleaned and correctly formatted using specialist software - this is primarily intended for those preparing large mailshots. The detailed rules for determining whether a UK postcode is valid are complex and require a checklist of all the characters which are valid in each position of the code. It is unlikely that the research user who does not have a repeated need to match large postcode lists will wish to become involved in this level of checking, which will also necessitate programming skills.

UK Postcode Patterns and Conversion Algorithms

If the user has a moderate level of confidence that a postcode list is already of reasonable quality and consistently formatted, it can be easiest to proceed with processing using software such as GeoConvert and then to attempt manual correction or removal of any postcodes which fail to match. Where "A" represents an alphabetic character, "N" represents a digit and "_" represents a space, valid postcodes will always follow one of these patterns:

Pattern    Example (code)  Example (place)
AN_NAA     B1 1AA          Royal Mail Central Birmingham Delivery Office
ANN_NAA    M60 2LA         Manchester City Council
AAN_NAA    SA6 7JL         Driver and Vehicle Licensing Agency, Swansea
AANN_NAA   SO17 1BJ        University of Southampton
ANA_NAA    W1D 1AN         Tottenham Court Road Tube Station, London
AANA_NAA   EC2R 8AH        Bank of England, London
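
The six patterns above can be turned into a simple shape check. The Python sketch below translates each pattern into a regular expression ("A" to an upper-case letter, "N" to a digit, "_" to a single space) and tests a code against all of them. The names PATTERNS, pattern_to_regex and has_valid_shape are illustrative. The check confirms only that a code has a permitted shape; it does not show that the postcode actually exists, and it does not apply the detailed position-by-position character rules.

```python
import re

# The six patterns from the table: A = letter, N = digit, _ = space.
PATTERNS = ["AN_NAA", "ANN_NAA", "AAN_NAA", "AANN_NAA", "ANA_NAA", "AANA_NAA"]

_CLASSES = {"A": "[A-Z]", "N": "[0-9]", "_": " "}


def pattern_to_regex(pattern: str) -> str:
    """Translate a pattern such as 'AN_NAA' into a regular expression."""
    return "".join(_CLASSES[ch] for ch in pattern)


# One compiled expression that accepts any of the six shapes.
SHAPE_RE = re.compile("|".join(pattern_to_regex(p) for p in PATTERNS))


def has_valid_shape(code: str) -> bool:
    """Return True if the whole string follows one of the six patterns."""
    return SHAPE_RE.fullmatch(code) is not None


for code in ["SO17 1BJ", "B1 1AA", "S017 1BJ", "SO171BJ", "so17 1bj"]:
    print(repr(code), has_valid_shape(code))
```

The first two codes pass. The last three fail because of a substituted zero, a missing space and lower-case letters respectively, so any case and spacing corrections need to be applied before a check of this kind.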

It is possible for the research user to employ simple tools to overcome some of the most commonly encountered problems before attempting to process their postcodes. Standard spreadsheet functions can be used to change all letters to upper case, to search for spaces, to check for simple substitutions and to pad or condense the postcode to a desired length. Before reformatting postcodes, the user should check whether a seven- or eight-character version is required by the intended analysis software.
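
The same steps can be scripted instead of being carried out in a spreadsheet. The Python sketch below is one possible approach, not a definitive cleaning routine. It upper-cases the code, removes spaces, treats the last three characters as the incode, and swaps "I"/"1" and "O"/"0" only in positions where the patterns listed above allow just a letter or just a digit. The function names clean and fixed_width are hypothetical. The padding rule (outcode left-aligned in four characters, with an extra space in the eight-character version) is an assumption and should be checked against the requirements of the intended software.

```python
from typing import Optional, Tuple

TO_DIGIT = {"I": "1", "O": "0"}
TO_LETTER = {"1": "I", "0": "O"}


def as_digit(ch: str) -> str:
    return TO_DIGIT.get(ch, ch)


def as_letter(ch: str) -> str:
    return TO_LETTER.get(ch, ch)


def clean(raw: str) -> Optional[Tuple[str, str]]:
    """Return (outcode, incode) after basic repairs, or None if the length is wrong."""
    code = "".join(raw.upper().split())      # upper case, no spaces
    if not 5 <= len(code) <= 7:
        return None
    outcode, incode = list(code[:-3]), code[-3:]
    incode = as_digit(incode[0]) + incode[1:]  # incode always starts with a digit
    outcode[0] = as_letter(outcode[0])         # outcode always starts with a letter
    if len(outcode) == 2:                      # AN
        outcode[1] = as_digit(outcode[1])
    elif len(outcode) == 4:                    # AANN or AANA
        outcode[1] = as_letter(outcode[1])
        outcode[2] = as_digit(outcode[2])
    # Three-character outcodes (ANN, AAN, ANA) are ambiguous and left unchanged.
    return "".join(outcode), incode


def fixed_width(outcode: str, incode: str, width: int = 8) -> str:
    """Pad to a seven- or eight-character form (assumed convention)."""
    if width not in (7, 8):
        raise ValueError("width must be 7 or 8")
    separator = " " if width == 8 else ""
    return outcode.ljust(4) + separator + incode


parts = clean("S0I7 IBJ")
print(parts)                        # ('SO17', '1BJ')
print(fixed_width(*parts, width=8))  # 'SO17 1BJ'
print(fixed_width(*parts, width=7))  # 'SO171BJ'
```

Automatic substitution of this kind is a heuristic: it can turn a mistyped code into a different postcode that looks valid, so repaired codes are best flagged for review and checked against the six patterns before matching.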

Additional Resources

The ONS Beginner's Guide to UK Geography [http://www.ons.gov.uk/ons/guide-method/geography/beginner-s-guide/index.html] contains a useful section on postcode structure and georeferencing issues.