HTML version of the SAS code to run the Exemplar 6 imputation from detailed questions
It is intended to show you the code and to allow links, not to be used
as a SAS program. The SAS program is the file ex6det.sas, which you should
save to a file and/or read in to SAS.
To see output from the commands, go to the SAS results file.
This program gives the code for the models that seemed to work best.
For details of all the different models that were fitted before the
final MICE model was selected, go to ex6DETprob.htm
Links in this page
Analysis with IVEWARE macro
Recalculating scores from imputed data
Postimputation procedures for IVEWARE results
Analysis with PROC MI
Postimputation procedures for PROC MI results
EVERYTHING INSIDE /* AND */ IS TAKEN AS A COMMENT in SAS programs
These are shown in green here
/*-----------------------------------------------------------------
Imputation SAS code for Exemplar 6
This program uses the IVEWARE macros that can be installed from
http://www.isr.umich.edu/src/smp/ive/
ALSO to run the IVEWARE software you need to close all advanced editor windows
and use only the old program editor
CHANGE the libname to where your data sets are stored
proc contents data=ex6.ex6 short position;run;
---------------------------------------------------------------------------------*/
libname ex6 "C:\Documents and Settings\gillian\My Documents\aprojects\peas\ex6datafiles\data";
/*------------------------------------------------------------------
Now the analysis of detailed questions using the IVEWARE macros
details of what the commands mean are in the user guide on the IVE
site. Here all variables are treated as categorical and each has 8
categories.
--------------------------------------------------------------------*/
title 'Imputation of detailed questions with IVEWARE';
/*-------------------------------------------------------------------
There were many false starts before this model was found to work reasonably well.
The file ex6probs.htm gives a summary of some of the things that can go wrong.
The code to test out these alternatives is in ex6probs.sas
The version that worked defined each of these variables as a count (Poisson)
variable, with an upper bound of 7.
-----------------------------------------------------------------------*/
%impute(name=ivesetupdet,
dir=C:\Documents and Settings\gillian raab\My Documents\aprojects\peaslaptop\ex6datafiles\program_code
,setup=new);
title Multiple imputation prevalence only;
datain ex6.ex6det;
dataout impout all ;
default count;
bounds YAYARS02(<=7) YAYBOP02(<=7) YAYBUS02(<=7) YAYCBK02(<=7) YAYGRF02(<=7) YAYHBK02(<=7) YAYHIT02(<=7) YAYHOM02(<=7) YAYJRD02(<=7)
YAYROB02(<=7) YAYSCL02(<=7) YAYSHP02(<=7) YAYSKV02(<=7) YAYVND02(<=7) YAYWEP02(<=7) YBYARS02(<=7) YBYBOP02(<=7) YBYBUS02(<=7) YBYCBK02(<=7)
YBYGRF02(<=7) YBYHBK02(<=7) YBYHIT02(<=7) YBYHOM02(<=7) YBYJRD02(<=7) YBYPET02(<=7) YBYROB02(<=7) YBYSCL02(<=7)
YBYSHP02(<=7) YBYSKV02(<=7) YBYVND02(<=7) YBYWEP02(<=7) YCYARS02(<=7) YCYBOP02(<=7) YCYBUS02(<=7) YCYCBK02(<=7)
YCYDRG02(<=7) YCYGRF02(<=7) YCYHBK02(<=7) YCYHIT02(<=7) YCYHOM02(<=7) YCYJRD02(<=7) YCYPET02(<=7) YCYRAB02(<=7) YCYROB02(<=7)
YCYSCL02(<=7) YCYSHP02(<=7) YCYSKV02(<=7) YCYVND02(<=7) YCYWEP02(<=7) YDYARS02(<=7) YDYBOP02(<=7) YDYBUS02(<=7) YDYCBK02(<=7) YDYDRG02(<=7) YDYGRF02(<=7)
YDYHBK02(<=7) YDYHIT02(<=7) YDYHOM02(<=7) YDYJRD02(<=7) YDYPET02(<=7) YDYRAB02(<=7) YDYROB02(<=7)
YDYSCL02(<=7) YDYSHP02(<=7) YDYSKV02(<=7) YDYVND02(<=7) YDYWEP02(<=7) YEYARS02(<=7) YEYBOP02(<=7) YEYBUS02(<=7)
YEYCBK02(<=7) YEYDRG02(<=7) YEYGRF02(<=7) YEYHBK02(<=7) YEYHIT02(<=7) YEYJRD02(<=7) YEYPET02(<=7) YEYRAB02(<=7)
YEYROB02(<=7) YEYRST02(<=7) YEYSHP02(<=7) YEYSKV02(<=7) YEYVND02(<=7)
YEYWEP02(<=7) YFYARS02(<=7) YFYBFT02(<=7) YFYBOP02(<=7) YFYCBK02(<=7) YFYDRG02(<=7) YFYFRD02(<=7)
YFYHBK02(<=7) YFYHIT02(<=7) YFYJRD02(<=7) YFYPET02(<=7) YFYRAB02(<=7) YFYROB02(<=7) YFYRST02(<=7) YFYRST22(<=7)
YFYSHP02(<=7) YFYSKV02(<=7) YFYVND02(<=7) YFYWEP02(<=7);
iterations 5;
multiples 10;
transfer caseid ;
run ;
/*------------------ save it after all that trouble--------------------*/
data ex6.detailimp;
set impout; * the data set named on the dataout line above;
run;
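/*----------------------------------------------------------
A possible quick check, not part of ex6det.sas: confirm that the
saved file holds the 10 multiples requested above and that the
imputed values respect the upper bound of 7. It assumes the
imputation number is held in _mult_, as in the code further down,
and uses three of the sweep-one variables only as an illustration.
-------------------------------------------------*/
proc freq data=ex6.detailimp;
tables _mult_;
run;
proc means data=ex6.detailimp n nmiss min max;
var yaybus02 yayshp02 yaywep02;
run;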
/*----------------------------------------------------------
Now use the macro to check results. First save to your disc and
run an include statement like this one
%include "C:\Documents and Settings\gillian raab\My Documents\aprojects\peaslaptop\ex6datafiles\program_code\checkimp_macro.sas";
-------------------------------------------------*/
%checkimp(ex6.detailimpmi,ex6.ex6det);
/*--------- now calculate prevalence etc from the imputed data----------*/
data ex6.fromdetail (keep=dprev: dvol: dvar: caseid _mult_);
set ex6.detailimp;
/*---------------------------- sweep 1-----------------------------*/
*Variety of delinquency at sweep one (n=15);
array offv1 yaybus02 yaybop02 yayshp02 yayjrd02 yayscl02 yaywep02
yaygrf02 yayrob02
yayvnd02 yayhbk02 yayhom02 yaycbk02 yayars02 yayhit02 yayskv02;
dprev1=0;
dvar1=0;
dvol1=0;
do i=1 to 15;
if offv1[i]>0 then do; dprev1=1; dvar1=dvar1+1; end;
if offv1[i]=7 then offv1[i]=11; * code to match SPSS;
dvol1=offv1[i]+dvol1;
end;
/*---------------------------- sweep 2-----------------
and similar code to recalculate scores at the other sweeps
see file ex6det.sas for detailed code
-----------------------------------------------------------*/
run;
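/*----------------------------------------------------------
A small check of the scoring logic on made-up data, not part of
ex6det.sas. The data set toyscore and the variables q1-q3 are
invented for this illustration; the loop follows the sweep 1 code.
With these three rows the scores should be
row 1: dprev=0 dvar=0 dvol=0
row 2: dprev=1 dvar=2 dvol=12 (the 7 is recoded to 11)
row 3: dprev=1 dvar=2 dvol=5
-------------------------------------------------*/
data toyscore;
input q1 q2 q3;
array offv q1 q2 q3;
dprev=0;
dvar=0;
dvol=0;
do i=1 to 3;
if offv[i]>0 then do; dprev=1; dvar=dvar+1; end;
if offv[i]=7 then offv[i]=11;
dvol=offv[i]+dvol;
end;
drop i;
datalines;
0 0 0
1 0 7
2 3 0
;
run;
proc print data=toyscore; run;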
/*----now merge this with some variables from the original file---*/
data temp1;
set ex6.ex6;
keep caseid gender sector szindep; * gender is needed for the logistic model below;
run;
proc sort data=ex6.fromdetail; by caseid;run;
proc sort data=temp1; by caseid;run;
data temp2;
merge ex6.fromdetail temp1;
by caseid;
rename _Mult_ =_imputation_; * to use proc MI;
run;
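/*----------------------------------------------------------
A possible check on the merge, not part of ex6det.sas: list in the
log any caseid found in only one of the two files. The names infrom
and intemp are invented for this sketch. Both files must already be
sorted by caseid, as above.
-------------------------------------------------*/
data _null_;
merge ex6.fromdetail(in=infrom) temp1(in=intemp);
by caseid;
if not (infrom and intemp) then put 'Unmatched: ' caseid= infrom= intemp=;
run;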
/*--------------------------------------------------------
POST IMPUTATION PROCEDURES
The imputation variable needs to be called _imputation_ (it was renamed above).
These are always 2 stage procedures
------------------------------------------------------------*/
proc sort data=temp2; by _imputation_; run;
/*---------------------------- get means using corr---------*/
PROC CORR DATA=temp2 COV OUT=OUTCOV(TYPE=COV) NOCORR noprint ;
VAR dprev1-dprev6 dvol1-dvol6 dvar1-dvar6 ;
BY _IMPUTATION_;
run;
PROC MIANALYZE data=outcov ;
VAR dprev1-dprev6 dvol1-dvol6 dvar1-dvar6 ;
RUN;
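/*----------------------------------------------------------
A sketch of what the second stage does, for dprev6 only; not part
of ex6det.sas. The data set names stage1 and pooled and the
variable names est, se, within, qbar, w, b and m are invented here.
Stage 1 gets the estimate and its standard error in each imputed
data set. Stage 2 combines them with Rubin's rules:
pooled estimate = mean of the estimates,
total variance = W + (1 + 1/m)*B, where W is the mean of the
squared standard errors and B the variance of the estimates.
The result should agree with the PROC MIANALYZE line for dprev6.
-------------------------------------------------*/
proc means data=temp2 noprint;
by _imputation_;
var dprev6;
output out=stage1 mean=est stderr=se;
run;
data stage1;
set stage1;
within=se**2; * within-imputation variance of the mean;
run;
proc means data=stage1 noprint;
var est within;
output out=pooled mean(est)=qbar mean(within)=w var(est)=b n(est)=m;
run;
data pooled;
set pooled;
total_var=w+(1+1/m)*b;
pooled_se=sqrt(total_var);
run;
proc print data=pooled;
var m qbar w b total_var pooled_se;
run;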
/*------------ now logistic regression and MIANALYZE -----*/
proc logistic data=temp2 outest=outreg covout descending noprint;
by _imputation_;
class gender szindep sector;
model dprev6= gender szindep sector ;
run;
* ------------------use this to find names of contrasts;
proc contents data= outreg short;run;
PROC MIANALYZE data=outreg edf=4325 ;
* edf are the residual degrees of freedom from the model;
var GENDERFemale SZINDEPManual_high_depr
sectorBehavioural sectorIndependent sectorSpecial ; run;
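/*----------------------------------------------------------
A possible alternative, not part of ex6det.sas, for more recent
SAS releases: save the estimates with ODS and let PROC MIANALYZE
match the CLASS levels itself, so that the contrast names do not
have to be looked up. The data set name lgsparms is invented here.
NOPRINT is dropped because it would also suppress the ODS table,
so the logistic output is printed for each imputation.
-------------------------------------------------*/
proc logistic data=temp2 descending;
by _imputation_;
class gender szindep sector;
model dprev6= gender szindep sector ;
ods output ParameterEstimates=lgsparms;
run;
proc mianalyze parms(classvar=classval)=lgsparms edf=4325;
class gender szindep sector;
modeleffects Intercept gender szindep sector;
run;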
/*--We can compare with the complete data analysis here---------*/
proc logistic data=ex6.ex6 descending ;
class gender szindep sector;
model dprev6= gender szindep sector ;
run;
/*---- now a check to compare prevalence by deprivation with original data---*/
proc means data=temp2;
class szindep;
var dprev6;
run;
proc means data=ex6.ex6;
class szindep;
var dprev6;
run;
/*------------------------------------------------------------
Now analyses with proc MI that assume all variables are
multivariate normal.
Upper bounds are set on all variables to prevent high values,
but lower bounds are not set because experience showed that
this resulted in a downward bias that gave too few responses
in the zero category.
The imputation needs to be run in two halves, as the problem
seems to be too big otherwise.
---------------------------------------------------------------*/
proc contents short data=ex6.ex6detail; * this just to get list of variables;
run;
data temp1;
set ex6.ex6detail;
keep yaybus02 ybybus02 ycybus02 ydybus02 yeybus02
yayskv02 ybyskv02 ycyskv02 ydyskv02 yeyskv02 yfyskv02
yeyrst02 yfyrst02 yfyrst22 yfybft02 yfyfrd02
yayshp02 ybyshp02 ycyshp02 ydyshp02 yeyshp02
yfyshp02 yayscl02 ybyscl02 ycyscl02 ydyscl02
yayhom02 ybyhom02 ycyhom02 ydyhom02
yaycbk02 ybycbk02 ycycbk02 ydycbk02 yeycbk02 yfycbk02
yayjrd02 ybyjrd02 ycyjrd02 ydyjrd02 yeyjrd02 yfyjrd02
yayrob02 ybyrob02 gender;
run;
/*------------------------------------------------
now the imputation, including rounding to the nearest whole number
-------------------------------------------------------*/
proc mi data=temp1 OUT=imp1 nimpute=10 round=1;
VAR yaybus02 ybybus02 ycybus02 ydybus02 yeybus02
yayskv02 ybyskv02 ycyskv02 ydyskv02 yeyskv02 yfyskv02
yeyrst02 yfyrst02 yfyrst22 yfybft02 yfyfrd02
yayshp02 ybyshp02 ycyshp02 ydyshp02 yeyshp02
yfyshp02 yayscl02 ybyscl02 ycyscl02 ydyscl02
yayhom02 ybyhom02 ycyhom02 ydyhom02
yaycbk02 ybycbk02 ycycbk02 ydycbk02 yeycbk02 yfycbk02
yayjrd02 ybyjrd02 ycyjrd02 ydyjrd02 yeyjrd02 yfyjrd02
yayrob02 ybyrob02
gender ;
MCMC timeplot(mean(yfybop02)) timeplot(mean(ydyhbk02)) timeplot(cov(yeyhbk02*yfyhbk02));
RUN;
/*---------- run this to check for extreme values--------*/
proc means data=imp1; run;
/*---- now pull values <0 or >7 into range------------*/
data imp1;
set imp1;
array vars yaybus02 ybybus02 ycybus02 ydybus02 yeybus02
yayskv02 ybyskv02 ycyskv02 ydyskv02 yeyskv02 yfyskv02
yeyrst02 yfyrst02 yfyrst22 yfybft02 yfyfrd02
yayshp02 ybyshp02 ycyshp02 ydyshp02 yeyshp02
yfyshp02 yayscl02 ybyscl02 ycyscl02 ydyscl02
yayhom02 ybyhom02 ycyhom02 ydyhom02
yaycbk02 ybycbk02 ycycbk02 ydycbk02 yeycbk02 yfycbk02
yayjrd02 ybyjrd02 ycyjrd02 ydyjrd02 yeyjrd02 yfyjrd02
yayrob02 ybyrob02
;
do over vars; if vars<0 then vars=0; if vars>7 then vars=7; end;
run;
/*-------------------------------------------------------------
now repeated for second lot of variables
code not shown here as just repeated
also missed out is the code to recalculate scores from
imputed data as it is identical to code above
Including checking the imputed values
-----------------------------------------------------------*/
/*-----------------------------------------------------------------------------------
POST IMPUTATION PROCEDURES
These are always 2 stage procedures
-------------------------------------------------------------------------------------*/
proc sort data=ex6.fromdetailmi; by _imputation_; run;
/*---------------------------- get means using corr--------------------------------*/
PROC CORR DATA=ex6.fromdetailmi COV OUT=OUTCOV(TYPE=COV) NOCORR noprint ;
VAR dprev1-dprev6 dvol1-dvol6 dvar1-dvar6 ;
where gender=1; * boys only;
BY _IMPUTATION_;
run;
PROC MIANALYZE data=outcov ;
title 'MI detailed imputation boys';
VAR dprev1-dprev6 dvol1-dvol6 dvar1-dvar6 ;
RUN;
PROC CORR DATA=ex6.fromdetailmi COV OUT=OUTCOV(TYPE=COV) NOCORR noprint ;
VAR dprev1-dprev6 dvol1-dvol6 dvar1-dvar6 ;
where gender=2; * girls only;
BY _IMPUTATION_;
run;
PROC MIANALYZE data=outcov ;
title 'MI detailed imputation girls';
VAR dprev1-dprev6 dvol1-dvol6 dvar1-dvar6 ;
RUN;
/*------------------- logistic regression and MIANALYZE---------------------------*/
proc logistic data=ex6.fromdetailmi outest=outreg covout descending noprint;
by _imputation_;
class gender szindep sector;
model dprev6= gender szindep sector ;
run;
proc contents data= outreg short;run;* use this to find names of contrasts;
PROC MIANALYZE data=outreg edf=4325 ; * these are the residual degrees of freedom from the model;
var GENDER1 SZINDEPManual_high_depr
sectorBehavioural sectorIndependent sectorSpecial ;
run;
/*--We can compare with the complete data analysis here---------*/
proc logistic data=ex6.ex6 descending ;
class gender szindep sector;
model dprev6= gender szindep sector ;
run;
/*---------------- now a check to compare prevalence by deprivation with original data---*/
proc means data=ex6.fromdetailmi;
class szindep;
var dprev6;
run;
proc means data=ex6.ex6;