HTML version of the SAS code to run the Exemplar 6 imputation from detailed questions

This page is intended to show you the code and to allow links; it is not meant to be used as a SAS program. The SAS program is the file ex6det.sas, which you should save to a file and/or read into SAS. To see the output from the commands, go to the SAS results file. This program gives the code for the models that seemed to work best. For details of all the different models that were fitted before the final MICE model was selected, go to ex6DETprob.htm.

Links in this page:
Analysis with IVEWARE macro
Recalculating scores from imputed data
Postimputation procedures for IVEWARE results
Analysis with PROC MI
Postimputation procedures for PROC MI results

Everything inside /* and */ is taken as a comment in SAS programs. Comments are shown in green here.
/*-----------------------------------------------------------------
Imputation SAS code for Exemplar 6

This program uses the IVEWARE macros that can be installed from
http://www.isr.umich.edu/src/smp/ive/

ALSO to run the IVEWARE software you need to close all advanced
editor windows and use only the old program editor

CHANGE the libname to where your data sets are stored

proc contents data=ex6.ex6 short position;run;
-----------------------------------------------------------------*/
libname ex6 "C:\Documents and Settings\gillian\My Documents\aprojects\peas\ex6datafiles\data";

/*------------------------------------------------------------------
Now the analysis of detailed questions using the IVEWARE macros.
Details of what the commands mean are in the user guide on the IVE site.
Here all variables are treated as categorical and each has 8 categories.
--------------------------------------------------------------------*/
title 'Imputation of detailed questions with IVEWARE';

/*-------------------------------------------------------------------
There were many false starts before this model was found to work
reasonably well. The file ex6probs.htm gives a summary of some of the
things that can go wrong. The code to test out these alternatives is
in ex6probs.sas. The version that worked defined each of these
categories as a Poisson variable.
-----------------------------------------------------------------------*/
%impute(name=ivesetupdet,
        dir=C:\Documents and Settings\gillian raab\My Documents\aprojects\peaslaptop\ex6datafiles\program_code,
        setup=new);
title Multiple imputation prevalence only;
datain ex6.ex6det;
dataout impout all;
default count;
bounds
  YAYARS02(<=7) YAYBOP02(<=7) YAYBUS02(<=7) YAYCBK02(<=7) YAYGRF02(<=7) YAYHBK02(<=7)
  YAYHIT02(<=7) YAYHOM02(<=7) YAYJRD02(<=7) YAYROB02(<=7) YAYSCL02(<=7) YAYSHP02(<=7)
  YAYSKV02(<=7) YAYVND02(<=7) YAYWEP02(<=7)
  YBYARS02(<=7) YBYBOP02(<=7) YBYBUS02(<=7) YBYCBK02(<=7) YBYGRF02(<=7) YBYHBK02(<=7)
  YBYHIT02(<=7) YBYHOM02(<=7) YBYJRD02(<=7) YBYPET02(<=7) YBYROB02(<=7) YBYSCL02(<=7)
  YBYSHP02(<=7) YBYSKV02(<=7) YBYVND02(<=7) YBYWEP02(<=7)
  YCYARS02(<=7) YCYBOP02(<=7) YCYBUS02(<=7) YCYCBK02(<=7) YCYDRG02(<=7) YCYGRF02(<=7)
  YCYHBK02(<=7) YCYHIT02(<=7) YCYHOM02(<=7) YCYJRD02(<=7) YCYPET02(<=7) YCYRAB02(<=7)
  YCYROB02(<=7) YCYSCL02(<=7) YCYSHP02(<=7) YCYSKV02(<=7) YCYVND02(<=7) YCYWEP02(<=7)
  YDYARS02(<=7) YDYBOP02(<=7) YDYBUS02(<=7) YDYCBK02(<=7) YDYDRG02(<=7) YDYGRF02(<=7)
  YDYHBK02(<=7) YDYHIT02(<=7) YDYHOM02(<=7) YDYJRD02(<=7) YDYPET02(<=7) YDYRAB02(<=7)
  YDYROB02(<=7) YDYSCL02(<=7) YDYSHP02(<=7) YDYSKV02(<=7) YDYVND02(<=7) YDYWEP02(<=7)
  YEYARS02(<=7) YEYBOP02(<=7) YEYBUS02(<=7) YEYCBK02(<=7) YEYDRG02(<=7) YEYGRF02(<=7)
  YEYHBK02(<=7) YEYHIT02(<=7) YEYJRD02(<=7) YEYPET02(<=7) YEYRAB02(<=7) YEYROB02(<=7)
  YEYRST02(<=7) YEYSHP02(<=7) YEYSKV02(<=7) YEYVND02(<=7) YEYWEP02(<=7)
  YFYARS02(<=7) YFYBFT02(<=7) YFYBOP02(<=7) YFYCBK02(<=7) YFYDRG02(<=7) YFYFRD02(<=7)
  YFYHBK02(<=7) YFYHIT02(<=7) YFYJRD02(<=7) YFYPET02(<=7) YFYRAB02(<=7) YFYROB02(<=7)
  YFYRST02(<=7) YFYRST22(<=7) YFYSHP02(<=7) YFYSKV02(<=7) YFYVND02(<=7) YFYWEP02(<=7);
iterations 5;
multiples 10;
transfer caseid;
run;

/*------------------ save it after all that trouble--------------------*/
data ex6.detailimp;
  set impout; * impout is the data set named on the DATAOUT statement above;
run;

/*----------------------------------------------------------
Now use the macro to check results. First save the macro to your disc
and run an include statement like this one:
%include "C:\Documents and Settings\gillian raab\My Documents\aprojects\peaslaptop\ex6datafiles\program_code\checkimp_macro.sas";
-------------------------------------------------*/
%checkimp(ex6.detailimpmi,ex6.ex6det);

/*--------- now calculate prevalence etc from the imputed data----------*/
data ex6.fromdetail (keep=dprev: dvol: dvar: caseid _mult_);
  set ex6.detailimp;
  /*---------------------------- sweep 1-----------------------------*/
  * Variety of delinquency at sweep one (n=15);
  array offv1{15} yaybus02 yaybop02 yayshp02 yayjrd02 yayscl02 yaywep02 yaygrf02 yayrob02
                  yayvnd02 yayhbk02 yayhom02 yaycbk02 yayars02 yayhit02 yayskv02;
  dprev1=0; dvar1=0; dvol1=0;
  do i=1 to 15;
    if offv1[i]>0 then do;
      dprev1=1;          * any offence at this sweep;
      dvar1=dvar1+1;     * number of different offence types;
    end;
    if offv1[i]=7 then offv1[i]=11; * code to match SPSS;
    dvol1=offv1[i]+dvol1;           * total volume of offending;
  end;
  /*---------------------------- sweep 2-----------------
  and similar code to recalculate scores at the other sweeps
  see file ex6det.sas for detailed code
  -----------------------------------------------------------*/
run;

/*----now merge this with some variables from the original file---*/
data temp1;
  set ex6.ex6;
  keep caseid gender sector szindep; * gender is needed by the logistic model below;
run;
proc sort data=ex6.fromdetail; by caseid; run;
proc sort data=temp1; by caseid; run;
data temp2;
  merge ex6.fromdetail temp1;
  by caseid;
  rename _Mult_ = _imputation_; * to use proc MI;
run;

/*--------------------------------------------------------
POST IMPUTATION PROCEDURES
The imputation variable needs to be called _imputation_ (renamed above).
These are always 2 stage procedures.
------------------------------------------------------------*/
proc sort data=temp2; by _imputation_; run;

/*---------------------------- get means using corr---------*/
PROC CORR DATA=temp2 COV OUT=OUTCOV(TYPE=COV) NOCORR noprint;
  VAR dprev1-dprev6 dvol1-dvol6 dvar1-dvar6;
  BY _IMPUTATION_;
run;
PROC MIANALYZE data=outcov;
  VAR dprev1-dprev6 dvol1-dvol6 dvar1-dvar6;
RUN;

/*------------ now logistic regression and MIANALYZE -----*/
proc logistic data=temp2 outest=outreg covout descending noprint;
  by _imputation_;
  class gender szindep sector;
  model dprev6 = gender szindep sector;
run;
* use this to find names of contrasts;
proc contents data=outreg short; run;
PROC MIANALYZE data=outreg edf=4325; * edf are the residual degrees of freedom from the model;
  var GENDERFemale SZINDEPManual_high_depr sectorBehavioural sectorIndependent sectorSpecial;
run;

/*--We can compare with the complete data analysis here---------*/
proc logistic data=ex6.ex6 descending;
  class gender szindep sector;
  model dprev6 = gender szindep sector;
run;

/*---- now a check to compare prevalence by deprivation with original data---*/
proc means data=temp2;
  class szindep;
  var dprev6;
run;
proc means data=ex6.ex6;
  class szindep;
  var dprev6;
run;
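
/*------------------------------------------------------------
A minimal sketch, not part of ex6det.sas, of the two stage
calculation that PROC MIANALYZE carries out (Rubin's rules),
shown for the mean of dprev6 only.
It assumes temp2 exists and is sorted by _imputation_ as above.
The data set names permean and rubin, and the variable names
est, se, within, qbar, ubar, b, m, total and se_mi, are made up
for this illustration.
------------------------------------------------------------*/
/* stage 1: estimate and standard error within each imputation */
proc means data=temp2 noprint;
  by _imputation_;
  var dprev6;
  output out=permean mean=est stderr=se;
run;
data permean;
  set permean;
  within = se**2; * sampling variance of the mean in this imputation;
run;
/* stage 2: combine across the m imputations */
proc means data=permean noprint;
  var est within;
  output out=rubin mean(est)=qbar mean(within)=ubar var(est)=b n(est)=m;
run;
data rubin;
  set rubin;
  total = ubar + (1 + 1/m)*b; * within variance plus inflated between variance;
  se_mi = sqrt(total);
run;
proc print data=rubin;
  var qbar ubar b m total se_mi;
run;
/*------------------------------------------------------------
qbar and se_mi should be close to the estimate and standard
error that PROC MIANALYZE reports for dprev6 above.
------------------------------------------------------------*/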
/*------------------------------------------------------------
Now analyses with proc MI that assume all variables are multivariate normal.
Upper bounds are set on all variables to prevent high values.
But lower bounds are not set because experience showed that this
resulted in a downward bias that gave too few responses in the zero category.
Needs to be run in two halves as the problem seems to be too big otherwise.
---------------------------------------------------------------*/
proc contents short data=ex6.ex6detail; * this just to get list of variables;
run;
data temp1;
  set ex6.ex6detail;
  keep yaybus02 ybybus02 ycybus02 ydybus02 yeybus02
       yayskv02 ybyskv02 ycyskv02 ydyskv02 yeyskv02 yfyskv02
       yeyrst02 yfyrst02 yfyrst22 yfybft02 yfyfrd02
       yayshp02 ybyshp02 ycyshp02 ydyshp02 yeyshp02 yfyshp02
       yayscl02 ybyscl02 ycyscl02 ydyscl02
       yayhom02 ybyhom02 ycyhom02 ydyhom02
       yaycbk02 ybycbk02 ycycbk02 ydycbk02 yeycbk02 yfycbk02
       yayjrd02 ybyjrd02 ycyjrd02 ydyjrd02 yeyjrd02 yfyjrd02
       yayrob02 ybyrob02
       gender;
run;

/*------------------------------------------------
now the imputation including rounding to the nearest whole number
-------------------------------------------------------*/
proc mi data=temp1 OUT=imp1 nimpute=10 round=1;
  VAR yaybus02 ybybus02 ycybus02 ydybus02 yeybus02
      yayskv02 ybyskv02 ycyskv02 ydyskv02 yeyskv02 yfyskv02
      yeyrst02 yfyrst02 yfyrst22 yfybft02 yfyfrd02
      yayshp02 ybyshp02 ycyshp02 ydyshp02 yeyshp02 yfyshp02
      yayscl02 ybyscl02 ycyscl02 ydyscl02
      yayhom02 ybyhom02 ycyhom02 ydyhom02
      yaycbk02 ybycbk02 ycycbk02 ydycbk02 yeycbk02 yfycbk02
      yayjrd02 ybyjrd02 ycyjrd02 ydyjrd02 yeyjrd02 yfyjrd02
      yayrob02 ybyrob02
      gender;
  MCMC timeplot(mean(yfybop02)) timeplot(mean(ydyhbk02)) timeplot(cov(yeyhbk02*yfyhbk02));
RUN;

/*---------- run this to check for extreme values--------*/
proc means data=imp1;
run;

/*---- now pull values <0 or >7 into range------------*/
data imp1;
  set imp1;
  * only the imputed offence variables are listed, so caseid and gender are left alone;
  array vars{*} yaybus02 ybybus02 ycybus02 ydybus02 yeybus02
                yayskv02 ybyskv02 ycyskv02 ydyskv02 yeyskv02 yfyskv02
                yeyrst02 yfyrst02 yfyrst22 yfybft02 yfyfrd02
                yayshp02 ybyshp02 ycyshp02 ydyshp02 yeyshp02 yfyshp02
                yayscl02 ybyscl02 ycyscl02 ydyscl02
                yayhom02 ybyhom02 ycyhom02 ydyhom02
                yaycbk02 ybycbk02 ycycbk02 ydycbk02 yeycbk02 yfycbk02
                yayjrd02 ybyjrd02 ycyjrd02 ydyjrd02 yeyjrd02 yfyjrd02
                yayrob02 ybyrob02;
  do i=1 to dim(vars);
    if vars[i]<0 then vars[i]=0;
    if vars[i]>7 then vars[i]=7;
  end;
  drop i;
run;

/*-------------------------------------------------------------
Now repeated for the second lot of variables. The code is not shown
here as it is just repeated. Also missed out is the code to recalculate
scores from the imputed data, as it is identical to the code above,
including checking the imputed values.
-----------------------------------------------------------*/

/*-----------------------------------------------------------------------------------
POST IMPUTATION PROCEDURES
These are always 2 stage procedures
-------------------------------------------------------------------------------------*/
proc sort data=ex6.fromdetailmi; by _imputation_; run;

/*---------------------------- get means using corr--------------------------------*/
PROC CORR DATA=ex6.fromdetailmi COV OUT=OUTCOV(TYPE=COV) NOCORR noprint;
  VAR dprev1-dprev6 dvol1-dvol6 dvar1-dvar6;
  where gender=1; * boys only;
  BY _IMPUTATION_;
run;
PROC MIANALYZE data=outcov;
  title 'MI detailed imputation boys';
  VAR dprev1-dprev6 dvol1-dvol6 dvar1-dvar6;
RUN;
PROC CORR DATA=ex6.fromdetailmi COV OUT=OUTCOV(TYPE=COV) NOCORR noprint;
  VAR dprev1-dprev6 dvol1-dvol6 dvar1-dvar6;
  where gender=2; * girls only;
  BY _IMPUTATION_;
run;
PROC MIANALYZE data=outcov;
  title 'MI detailed imputation girls';
  VAR dprev1-dprev6 dvol1-dvol6 dvar1-dvar6;
RUN;

/*------------------- logistic regression and MIANALYZE---------------------------*/
proc logistic data=ex6.fromdetailmi outest=outreg covout descending noprint;
  by _imputation_;
  class gender szindep sector;
  model dprev6 = gender szindep sector;
run;
proc contents data=outreg short; run; * use this to find names of contrasts;
PROC MIANALYZE data=outreg edf=4325; * these are the residual degrees of freedom from the model;
  var GENDER1 SZINDEPManual_high_depr sectorBehavioural sectorIndependent sectorSpecial;
run;

/*--We can compare with the complete data analysis here---------*/
proc logistic data=ex6.ex6 descending;
  class gender szindep sector;
  model dprev6 = gender szindep sector;
run;

/*---------------- now a check to compare prevalence by deprivation with original data---*/
proc means data=ex6.fromdetail;
  class szindep;
  var dprev6;
run;
proc means data=ex6.ex6;