This exemplar is about estimating the
income distribution of households in Scotland and was
motivated by work commissioned by Communities Scotland. It is based on
interviews carried out in Scotland as part of the Family Resources Survey.
The Family Resources Survey (FRS) data for
this exemplar were obtained from the ESRC Essex
Data Archive and originally consisted of the FRS data for
the financial year 2002/2003. The data differ from those held in
the data archive because of steps taken to prevent the
disclosure of information about individuals. The principles we have followed in
anonymising the data can be viewed here and the details of what has
been done in this case are explained here.
Analyses of the anonymised data provide similar results
to those that would have been obtained from the original data.
Our analyses also depart, to some extent,
from those carried out by the FRS team. We have
post-stratified the data to match Scottish totals. The FRS
team carried out a similar post-stratification, but this was
at a UK level.
We have prepared small data sets (see Table 1.2 and the data
codebook) with only the variables you need for each
package. To find out how the variables relate to those on the
In this exemplar we illustrate how to rake
a survey, that is, to adjust its weights so that it matches more
than one set of population totals. Stata and R can both do this,
as can the SAS macro CALMAR, and we illustrate each of them.
Once the survey has been post-stratified, the calculation of standard
errors should allow for the gain in precision that the
post-stratification is expected to give. This requires additional
methods of analysis. Both Stata and R can do this by replication methods, and the R
survey package can also do it by a calibration method.
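
To make the idea of raking concrete, the following is a minimal sketch in Python of iterative proportional fitting, the usual algorithm behind raking. It is not the code used by Stata, R or CALMAR, and it does not use the FRS data: the function names (rake, weighted_totals), the variables (tenure, region), the six sample rows and the population totals are all hypothetical and chosen only for illustration. The sketch assumes that every category has at least one respondent and that the sets of population totals agree on the overall population size.

```python
from collections import defaultdict


def weighted_totals(rows, weights, var):
    """Sum the weights within each category of one variable."""
    totals = defaultdict(float)
    for row, w in zip(rows, weights):
        totals[row[var]] += w
    return dict(totals)


def rake(rows, weights, margins, max_iter=100, tol=1e-10):
    """Rake weights by iterative proportional fitting (illustrative sketch).

    rows    -- list of dicts holding each respondent's categories
    weights -- list of starting weights, one per respondent
    margins -- {variable: {category: population total}}
    """
    w = list(weights)
    for _ in range(max_iter):
        max_change = 0.0
        for var, targets in margins.items():
            current = weighted_totals(rows, w, var)
            for i, row in enumerate(rows):
                # Scale each weight so this variable's totals match exactly.
                factor = targets[row[var]] / current[row[var]]
                w[i] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        # Stop once a full pass over all variables changes nothing material.
        if max_change < tol:
            break
    return w


# Hypothetical sample of six households, each with a starting weight of 1.
rows = [
    {"tenure": "own", "region": "north"},
    {"tenure": "own", "region": "north"},
    {"tenure": "own", "region": "south"},
    {"tenure": "rent", "region": "north"},
    {"tenure": "rent", "region": "south"},
    {"tenure": "rent", "region": "south"},
]
start_weights = [1.0] * len(rows)

# Hypothetical population totals; both sets sum to 100 households.
margins = {
    "tenure": {"own": 60.0, "rent": 40.0},
    "region": {"north": 50.0, "south": 50.0},
}

raked = rake(rows, start_weights, margins)
print(weighted_totals(rows, raked, "tenure"))
print(weighted_totals(rows, raked, "region"))
```

Each pass rescales the weights to match one set of totals, which slightly disturbs the match to the other set, so the passes are repeated until the weights stop changing. The standard errors are a separate matter: the sketch produces the raked weights only and makes no allowance for the survey design.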