Exemplar 6 - Preprocessing

The programs below are provided to allow you to see how the data were extracted from the large survey files and made ready for analysis by different packages. It is not expected that you would want to do this again, but you may find it helpful for doing similar things yourself.

The data analysed in this exemplar are taken from six sweeps of the Edinburgh Study of Youth Transitions and Crime (ESYTC). The data for sweeps 1 to 4 of the study are available from the ESRC data archive, where detailed documentation and codebooks can be accessed. The data used for this exemplar were obtained directly from the survey team, who have kindly provided a file for longitudinal analysis. The majority of variables in this longitudinal file come from the Young People's questionnaires, which can be viewed at the ESYTC web site, but some come from administrative data, such as having a school record of truancy.

The program to extract and reorganise the data (originally supplied in SPSS) was written in SAS: ex6_prep.sas. It produces a SAS data set, ex6.sas7bdat, and a transport file, ex6.xpt, that can be read into other systems.
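As an illustration of what "reading the transport file into another system" can look like, here is a minimal Python sketch. Python is not one of the packages covered on this page, so this is an assumption-laden example rather than the method used for the exemplar. It assumes that ex6.xpt is a SAS version 5 transport file in the working directory and that the pandas library is installed. The helper names looks_like_xport and load_transport are hypothetical.

```python
from pathlib import Path

# Opening bytes of the library header record in a SAS version 5
# transport (XPORT) file.
XPORT_MAGIC = b"HEADER RECORD*******LIBRARY HEADER RECORD!!!!!!!"


def looks_like_xport(path):
    """Return True if the file starts with the XPORT library header."""
    with open(path, "rb") as handle:
        return handle.read(len(XPORT_MAGIC)) == XPORT_MAGIC


def load_transport(path="ex6.xpt"):
    """Read a SAS transport file into a pandas DataFrame.

    The default file name is taken from the text above; its location
    in the working directory is an assumption.
    """
    path = Path(path)
    if not path.exists():
        raise FileNotFoundError(f"transport file not found: {path}")
    if not looks_like_xport(path):
        raise ValueError(f"{path} does not look like a SAS transport file")
    # Imported here so that the header check works without pandas.
    import pandas as pd
    # With no encoding given, pandas leaves text variables as raw bytes.
    return pd.read_sas(path, format="xport")
```

Calling load_transport() with no arguments would then return the contents of ex6.xpt as a data frame. Checking the header first gives a clearer error message if the wrong file (for example ex6.sas7bdat) is passed by mistake.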

The following steps were used to read the SAS data sets into other packages :-

The data sets for analysis can all be accessed from the main page of exemplar 6.