Geographical Referencing Learning Resources

Ecological Fallacy

The ecological fallacy applies to all aggregated datasets, not just those for geographical areas, but it is particularly relevant to geographical analysis and mapping. It arises whenever events, individuals or households are aggregated, and therefore applies to many variables of interest in the social sciences, for example unemployment rates and ethnic composition. At the individual level there is some degree of association between non-white ethnicity and unemployment: it may be positive, negative or absent. Without individual-level data on both variables we cannot measure this association directly, so we are usually forced to use areally aggregated data. Typically, data are aggregated from the source observations to areas such as census wards or output areas, whose boundaries have no special meaning in terms of the underlying geographical distributions of, for example, unemployment or ethnic composition.

The ecological fallacy refers to the fact that the association observed between variables at a given level of aggregation does not indicate the association that exists among individuals or at other aggregation scales. This can be illustrated with a 16-area example based on the same population counts as shown in Figure 1 of the modifiable areal unit problem (MAUP) resource page:

Figure 1: Percentage non-white ethnic group in 16 areas

Figure 2: Percentage unemployment

Figure 3: Percentage unemployed, non-white aggregated to four areas

Figure 4: Percentage unemployed, non-white aggregated to four areas

Figure 1 shows the percentage non-white population for the same 16 areas for which Figure 2 shows the percentage unemployment. The correlation between the values in Figures 1 and 2 is r=0.06, indicating almost no association between the two variables at the 16-area level. However, when the 16 areas are grouped into four larger areas in two different ways, the aggregation in Figure 3 suggests an association of r=0.54 while that in Figure 4 suggests r=0.68. In reality, the apparent association is due primarily to the boundary configuration, which leads to a particular aggregation structure. It cannot be determined from these aggregate data what the association may be at the individual level, or at any other level of aggregation. The ecological fallacy, as applied to area-based data, is closely related to the modifiable areal unit problem.
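
The effect of the boundary configuration can be explored with a short calculation. The following is a minimal Python sketch, not the calculation behind Figures 1 to 4: the counts are invented for illustration, and the function names (pearson, percentages, aggregate, correlation_for) and the two zoning schemes (quadrants and rows of an assumed 4x4 grid) are hypothetical. It sums the counts within each zone, recomputes the percentages, and reports the correlation under each zoning.

```python
from math import sqrt

# Hypothetical counts for 16 areas on a 4x4 grid, in row-major order.
# These values are invented; they are NOT the data behind Figures 1-4.
population = [100, 120, 80, 150, 90, 110, 130, 70,
              140, 100, 60, 120, 110, 90, 100, 130]
non_white = [10, 30, 8, 45, 20, 11, 40, 7,
             14, 35, 6, 30, 33, 9, 25, 13]
unemployed = [12, 10, 9, 20, 5, 14, 15, 8,
              18, 9, 7, 10, 10, 12, 8, 17]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def percentages(counts, totals):
    """Convert counts to percentages of the corresponding totals."""
    return [100 * c / t for c, t in zip(counts, totals)]


def aggregate(values, zones):
    """Sum values within each zone; a zone is a list of area indices."""
    return [sum(values[i] for i in zone) for zone in zones]


def correlation_for(zones):
    """Correlation of % non-white with % unemployed under a given zoning."""
    pop = aggregate(population, zones)
    pct_non_white = percentages(aggregate(non_white, zones), pop)
    pct_unemployed = percentages(aggregate(unemployed, zones), pop)
    return pearson(pct_non_white, pct_unemployed)


# Three zonings of the same 16 areas (assumed layouts, for illustration).
each_area = [[i] for i in range(16)]
quadrants = [[0, 1, 4, 5], [2, 3, 6, 7], [8, 9, 12, 13], [10, 11, 14, 15]]
rows = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]

results = {
    "16 areas": correlation_for(each_area),
    "4 areas (quadrants)": correlation_for(quadrants),
    "4 areas (rows)": correlation_for(rows),
}
for name, r in results.items():
    print(f"{name}: r = {r:.2f}")
```

The underlying counts are identical in all three cases; only the grouping changes. Percentages are recomputed from summed counts rather than averaged across areas, so that each zone is weighted by its population. None of the three coefficients says anything about the association among individuals, since the counts do not record how many people are both non-white and unemployed.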