The Spatial Nature of Social Science Data

The social sciences deal with many objects of study including individuals, families, households, jobs, events, organizations, journeys and networks. It is entirely possible to meaningfully study these phenomena aspatially - for example examining the relationships between individuals within a household - without any regard for their geographical location. However, it is important to recognise that each of these phenomena is geographically situated: it has a spatial location. If we are able to record these locations then it becomes possible to undertake explicitly spatial analyses (for example to explore the way in which family structure varies in different neighbourhoods) or to use location as a means of linking otherwise disconnected data (for example to identify the local unemployment rate at the place of residence of each survey respondent).

The spatial nature of social science data opens up avenues for research which are not possible if these data are treated aspatially, but it is necessary to understand the ways in which social science phenomena may be associated with spatial locations. The social sciences differ in important ways from the physical sciences, where the spatial location of the object of study, such as a coastline or landfill site, can be directly surveyed in the form of spatial coordinates. Georeferencing (short for geographical referencing) in the social sciences is usually indirect, by reference to some intermediate geography such as an address, postcode or administrative area.

Common Geographical Addresses

The first category of geographical references is that which relates to individual addresses. These include the residential or work addresses of survey respondents, the addresses of workplaces or other organizations and the addresses of locations at which events take place - for example shops and hospitals where services are delivered. Postal addresses such as ‘23 Acacia Avenue, Coketown’ are widely used in common language and it is generally possible to assign them spatial coordinates, either directly or via postal codes, by matching them against published directories of locations.
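The matching step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the directory contents, the postcodes and the coordinates are invented, the function names are hypothetical, and the formatting rule assumes UK-style postcodes whose final part is always three characters long.

```python
from typing import Optional, Tuple

# Hypothetical postcode directory: postcode -> (easting, northing) in metres.
# Both the postcodes and the coordinates are invented for illustration.
POSTCODE_DIRECTORY = {
    "CK1 2AB": (412350.0, 287640.0),
    "CK1 3CD": (412910.0, 288120.0),
}


def normalise_postcode(raw: str) -> str:
    """Upper-case a postcode and put one space before its last three characters."""
    compact = "".join(raw.split()).upper()
    return f"{compact[:-3]} {compact[-3:]}"


def georeference(postcode: str) -> Optional[Tuple[float, float]]:
    """Return the directory coordinates for a postcode, or None if it is unmatched."""
    return POSTCODE_DIRECTORY.get(normalise_postcode(postcode))
```

A respondent who wrote their postcode as "ck12ab" would be matched to the same directory entry as "CK1 2AB", while a postcode missing from the directory returns None and has to be handled separately, for instance by falling back to a larger area.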

A letter box on an entrance door

A road sign showing county name

A second major group of geographical references is that which relates to areas. These may be areas used for administrative, electoral or statistical purposes such as ‘County of Wiltshire’. Sometimes these areas are used because no more precise location is known, but more often because data at the individual level have been deliberately aggregated in order to preserve the confidentiality of census or survey respondents. A particular challenge with areal unit names is that their use in common language may not correspond exactly with their formal definition in administrative or statistical terms. Thus ‘London’ may be used to refer to a built-up area, a collection of administrative boroughs, the area within an orbital motorway or an ill-defined urban area associated with particular landmarks. These may all be of interest to an urban sociologist, although some will be more meaningful than others in terms of the lives of London residents. Geographical areas are formally defined by a descriptor (a textual name or alphanumerical code) and a set of boundary locations.
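The formal definition of an area as a descriptor plus a set of boundary locations maps naturally onto a small data structure. The sketch below is illustrative only: the class name, the area code and the square boundary are invented, and the containment test is a standard ray-casting check rather than anything prescribed here. It shows how a point location can be assigned to an area once the boundary is known.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class Area:
    code: str               # alphanumerical descriptor (hypothetical)
    name: str               # textual descriptor (hypothetical)
    boundary: List[Point]   # boundary vertices in order; the closing edge is implied

    def contains(self, point: Point) -> bool:
        """Ray-casting test: count boundary edges crossed by a ray heading east."""
        x, y = point
        inside = False
        n = len(self.boundary)
        for i in range(n):
            x1, y1 = self.boundary[i]
            x2, y2 = self.boundary[(i + 1) % n]
            # Only edges that straddle the horizontal line through the point count.
            if (y1 > y) != (y2 > y):
                x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
                if x < x_cross:
                    inside = not inside
        return inside


# An invented square area, 10 units on each side.
example_area = Area("A001", "Example District", [(0, 0), (10, 0), (10, 10), (0, 10)])
```

With this definition, example_area.contains((5, 5)) is True and example_area.contains((15, 5)) is False. Points lying exactly on the boundary are not treated specially, which a real application would need to consider.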

A third category comprises phenomena which are best described as linear features connecting two or more locations: examples include roads and paths, journeys to work or migration routes. More subtle examples include flows of commodities or communications links such as telephone calls. In common language these phenomena may be given identifiers such as ‘the X7 bus route’ or may be defined only by their start and end points, such as a pair of home and work addresses.
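Both kinds of linear feature, a named route with intermediate points and a journey known only by its start and end, can be held in the same structure once their locations have been converted to coordinates. The following is a minimal sketch with a hypothetical class name and invented coordinates; it assumes planar coordinates in consistent units so that straight-line distances are meaningful.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class LinearFeature:
    identifier: str     # e.g. a route name, or a label for an origin-destination pair
    path: List[Point]   # ordered locations; two points when only start and end are known

    def length(self) -> float:
        """Sum of straight-line distances between consecutive points on the path."""
        return sum(math.dist(a, b) for a, b in zip(self.path, self.path[1:]))


# Invented coordinates: a route with an intermediate stop, and a journey
# to work described only by its home and work locations.
bus_route = LinearFeature("X7 bus route", [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)])
commute = LinearFeature("home to work", [(0.0, 0.0), (3.0, 10.0)])
```

The two features share the same end points, but the route with an intermediate stop is longer than the direct home-to-work line. This is one reason why a journey defined only by its start and end points may understate the distance actually travelled.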

A sheltered bus-stop

A street name sign

Many social science phenomena do not fit neatly into the above categories because they take place at spatial locations which are not readily described by indirect referencing systems such as addresses or areas. An example would be a theft from a car parked on the edge of a wood. This event may be one of a series of objects of criminological study which display interesting geographical patterns, but which cannot be located except by complex textual description and for which it is not possible to identify specific grid references after the event. Other examples might include road accidents or environmental quality. In the latter case, neighbourhoods will display different aesthetic characteristics that may affect quality of life but that cannot be assigned to exact spatial coordinates.

A further category to which it can be challenging to assign appropriate spatial locations comprises phenomena which have multiple (or mobile) locations or which can be recognised at multiple geographical scales. For example, the identity of a major retail chain and the location of its registered office will not be an appropriate spatial reference for the place of employment of its many employees who work in local branches or routinely travel between sites. In these special cases, the most appropriate spatial location to be captured will depend on the specific purpose for which the researcher needs the information: the employment researcher and the business analyst may decide to treat this information in quite different ways.

A directory listing in an industrial estate