/*---------------------------------------------------
 HOW TO USE THIS FILE
 This is an HTML version of the Stata do file ex6det.do
 It is intended to show you the code and to allow links, not to be used interactively
 Pasting the red stuff (Stata commands) into Stata might work, but it might not

 EVERYTHING BETWEEN (SLASH STAR) AND (STAR SLASH)
 IS TAKEN AS A COMMENT. These are shown in black here
---------------------------------------------------------*/
Analysis with MICE package
Recalculate scores 
Postimputation procedures 


/*----------------------------------------------------
first read in the file, but before this increase the memory
  available to Stata because this exemplar needs it.
  this command with your file name will do it or use the menus
  You must increase the memory before reading the data.

Change the file name to where you have the data stored.
--------------------------------------------------------------*/

clear
set memory 900m
use "C:\Documents and Settings\gillian raab\My documents\aprojects\peaslaptop\ex6datafiles\data\ex6det.dta", clear

/*---------------------------------------------------------------------
 the imputation method we will use is the implementation of mice
 for Stata by patrick royston. this is also an add-on package.
 to install it, search the net resources for mice (in capital letters)
 and install the package. you can then use and get help for the commands
   mvis      - to do the imputation
   micombine - for analysing multiple imputations
   mijoin
   misplit
 A NEW VERSION OF THIS WILL BE AVAILABLE SHORTLY WITH mvis RENAMED AS ice

 the results will be saved in the data set named after using (imp1 below).
 the m(10) subcommand gets 10 sets of imputed data.
 genmiss(m), used in the grouped runs below, creates new variables
 (prefixed with m) to show which ones have been imputed.
 seed sets the seed for the random number so that we can get exactly
 the same results again

 We have selected the ordered logit option for all 102 questions and
 include all the variables in the model.

 IT IS IMPORTANT TO USE THE DRAW OPTION HERE SINCE THERE IS A PROBLEM WITH
 THE DEFAULT METHOD OF PREDICTIVE MEAN MATCHING
 THE NEW VERSION WILL HAVE DRAW AS THE DEFAULT - SO CHECK THE HELP
---------------------------------------------------------------------*/

#delimit;
ice2 yfyfrd02 ycyrab02
 yaybus02 yayshp02 yaybop02 yayjrd02 yayscl02 yaywep02 yayvnd02 yayhbk02
 yaygrf02 yayrob02 yayhom02 yayars02 yayhit02 yaycbk02 yayskv02
 ybybus02 ybyshp02 ybybop02 ybyjrd02 ybyscl02 ybywep02 ybyvnd02 ybyhbk02
 ybygrf02 ybyrob02 ybyhom02 ybyars02 ybyhit02 ybycbk02 ybyskv02 ybypet02
 ycybus02 ycyshp02 ycybop02 ycyjrd02 ycyscl02 ycywep02 ycyvnd02 ycyhbk02
 ycygrf02 ycyrob02 ycyhom02 ycyars02 ycyhit02 ycycbk02 ycyskv02 ycypet02
 ycydrg02
 ydybus02 ydyshp02 ydybop02 ydyjrd02 ydyscl02 ydywep02 ydyvnd02 ydyhbk02
 ydygrf02 ydyrob02 ydyhom02 ydyars02 ydyhit02 ydycbk02 ydyskv02 ydypet02
 ydydrg02 ydyrab02
 yeybus02 yeyshp02 yeybop02 yeyjrd02 yeywep02 yeyvnd02 yeyhbk02 yeygrf02
 yeyrob02 yeyars02 yeyhit02 yeycbk02 yeyskv02 yeypet02 yeydrg02 yeyrab02
 yeyrst02
 yfyshp02 yfybop02 yfyjrd02 yfywep02 yfyvnd02 yfyhbk02 yfyrob02 yfyars02
 yfyhit02 yfycbk02 yfyskv02 yfypet02 yfydrg02 yfyrab02 yfyrst02 yfyrst22
 yfybft02 yfyfrd02
 gender
 using imp1 , seed(812736) m(10) cmd(ologit) replace ;
#delimit cr

/************************************************************************
 This failed as the models were too big and had convergence problems.
 So break up into smaller sets of variables; then the variables are
 imputed from the other ones in their group.
 There are a total of 9 such groups, same procedure for each group.
 Gender (no missing data) is included along with each one.
************************************************************************/

use "C:\Documents and Settings\gillian raab\My Documents\aprojects\peaslaptop\ex6datafiles\data\ex6det.dta", clear
/*----------- first save just these variables-----------*/
#delimit;
keep caseid
 yaybus02 ybybus02 ycybus02 ydybus02 yeybus02
 yayskv02 ybyskv02 ycyskv02 ydyskv02 yeyskv02 yfyskv02
 yeyrst02 yfyrst02 yfyrst22 yfybft02 yfyfrd02
 gender;
/*------------ then impute-----------------------------*/
#delimit;
ice2
 yaybus02 ybybus02 ycybus02 ydybus02 yeybus02
 yayskv02 ybyskv02 ycyskv02 ydyskv02 yeyskv02 yfyskv02
 yeyrst02 yfyrst02 yfyrst22 yfybft02 yfyfrd02
 gender
 using imp1 , seed(812736) m(10) cmd(ologit) genmiss(m) replace ;
#delimit cr
/*-------------------- some sample plots for observed and imputed data------*/
histogram yfyskv02, discrete percent gap(50) bfcolor(red) blcolor(red) xtitle(Skiving school sweep 6) by(myfyskv02, legend(off))

/************************************************************************/
use "C:\Documents and Settings\gillian raab\My Documents\aprojects\peaslaptop\ex6datafiles\data\ex6det.dta", clear
#delimit ;
keep caseid
 yayshp02 ybyshp02 ycyshp02 ydyshp02 yeyshp02 yfyshp02
 yayscl02 ybyscl02 ycyscl02 ydyscl02
 yayhom02 ybyhom02 ycyhom02 ydyhom02
 gender;
#delimit;
ice2
 yayshp02 ybyshp02 ycyshp02 ydyshp02 yeyshp02 yfyshp02
 yayscl02 ybyscl02 ycyscl02 ydyscl02
 yayhom02 ybyhom02 ycyhom02 ydyhom02
 gender
 using imp2 , seed(67736) m(10) genmiss(m) cmd(ologit) replace;
#delimit cr
/*-------------------- some sample plots for observed and imputed data------*/
histogram yfyskv02, discrete percent gap(50) bfcolor(red) blcolor(red) xtitle(Skiving school sweep 6) by(myfyskv02, legend(off))

/**********************************************************************/
use "C:\Documents and Settings\gillian raab\My Documents\aprojects\peaslaptop\ex6datafiles\data\ex6det.dta", clear
#delimit;
keep caseid
 yaycbk02 ybycbk02 ycycbk02 ydycbk02 yeycbk02 yfycbk02
 yayjrd02 ybyjrd02 ycyjrd02 ydyjrd02 yeyjrd02 yfyjrd02
 yayrob02 ybyrob02
 gender;
#delimit;
ice2
 yaycbk02 ybycbk02 ycycbk02 ydycbk02 yeycbk02 yfycbk02
 yayjrd02 ybyjrd02 ycyjrd02 ydyjrd02 yeyjrd02 yfyjrd02
 yayrob02 ybyrob02
 gender
 using imp3,draw seed(899736) m(10) genmiss(m) cmd(ologit) replace;
#delimit cr
/*-------------------- some sample plots for observed and imputed data------*/
use imp3,clear
histogram ydycbk02, discrete percent gap(50) bfcolor(red) blcolor(red) xtitle(Car breaking sweep 4) by(mydycbk02, legend(off))

/*****************************************************************/
use "C:\Documents and Settings\gillian raab\My Documents\aprojects\peaslaptop\ex6datafiles\data\ex6det.dta", clear
#delimit;
keep caseid
 yaywep02 ybywep02 ycywep02 ydywep02 yeywep02 yfywep02
 ycydrg02 ydydrg02 yeydrg02 yfydrg02
 gender
 yayhit02 ybyhit02 ycyhit02 ydyhit02 yeyhit02 yfyhit02 ;
#delimit;
ice2
 yaywep02 ybywep02 ycywep02 ydywep02 yeywep02 yfywep02
 ycydrg02 ydydrg02 yeydrg02 yfydrg02
 gender
 yayhit02 ybyhit02 ycyhit02 ydyhit02 yeyhit02 yfyhit02
 using imp4 , seed(87646) m(10) genmiss(m) cmd(ologit) replace;
#delimit cr
use imp4,clear
histogram yaywep02, discrete percent gap(50) bfcolor(red) blcolor(red) xtitle(Weapon carrying sweep 1) by(myaywep02, legend(off))
replace yaywep02=round(yaywep02)
replace yaywep02=0 if yaywep02<0

/***********************************************************************/
use "C:\Documents and Settings\gillian raab\My Documents\aprojects\peaslaptop\ex6datafiles\data\ex6det.dta", clear
#delimit;
keep caseid
 yayvnd02 ybyvnd02 ycyvnd02 ydyvnd02 yeyvnd02 yfyvnd02
 yaygrf02 ybygrf02 ycygrf02 ydygrf02 yeygrf02
 gender
 yayars02 ybyars02 ycyars02 ydyars02 yeyars02 yfyars02 ;
#delimit;
ice2
 yayvnd02 ybyvnd02 ycyvnd02 ydyvnd02 yeyvnd02 yfyvnd02
 yaygrf02 ybygrf02 ycygrf02 ydygrf02 yeygrf02
 gender
 yayars02 ybyars02 ycyars02 ydyars02 yeyars02 yfyars02
 using imp5 , seed(99806) m(10) genmiss(m) cmd(ologit) replace;
#delimit cr
/*===================== check imputations graphically--------------*/
use imp5,clear
histogram yayars02, discrete percent gap(50) bfcolor(red) blcolor(red) by(myayars02, legend(off))

/*************************************************************************/
use "C:\Documents and Settings\gillian raab\My Documents\aprojects\peaslaptop\ex6datafiles\data\ex6det.dta", clear
#delimit;
keep caseid ycyrab02 ydyrab02 yeyrab02 yfyrab02 gender;
#delimit;
ice2 ycyrab02 ydyrab02 yeyrab02 yfyrab02 gender
 using imp6 , seed(8146) m(10) genmiss(m) cmd(ologit) replace;
#delimit cr
/*===================== check imputations graphically--------------*/
use imp6,clear
histogram ycyrab02, discrete percent gap(50) bfcolor(red) blcolor(red) by(mycyrab02, legend(off))

/*************************************************************************/
use "C:\Documents and Settings\gillian raab\My Documents\aprojects\peaslaptop\ex6datafiles\data\ex6det.dta", clear
#delimit;
keep caseid gender
 yayhbk02 ybyhbk02 ycyhbk02 ydyhbk02 yeyhbk02 yfyhbk02
 ycyrob02 ydyrob02 yeyrob02 yfyrob02 ;
#delimit;
ice2
 yayhbk02 ybyhbk02 ycyhbk02 ydyhbk02 yeyhbk02 yfyhbk02
 ycyrob02 ydyrob02 yeyrob02 yfyrob02
 gender
 using imp7 , seed(2246) m(10) genmiss(m) cmd(ologit) replace;
#delimit cr
/*===================== check imputations graphically--------------*/
use imp7,clear
histogram yeyrob02, discrete percent gap(50) bfcolor(red) blcolor(red) by(myeyrob02, legend(off))

/***********************************************************************/
use "C:\Documents and Settings\gillian raab\My Documents\aprojects\peaslaptop\ex6datafiles\data\ex6det.dta", clear
#delimit;
keep caseid ybypet02 ycypet02 ydypet02 yeypet02 yfypet02 gender;
#delimit;
ice2 ybypet02 ycypet02 ydypet02 yeypet02 yfypet02 gender
 using imp8 , seed(8146) m(10) genmiss(m) cmd(ologit) replace;
#delimit cr
/*===================== check imputations graphically--------------*/
use imp8,clear
histogram yeypet02, discrete percent gap(50) bfcolor(red) blcolor(red) by(myeypet02, legend(off))

/****************** add in extra variables for modelling ****************************************************/
use "C:\Documents and Settings\gillian raab\My Documents\aprojects\peaslaptop\ex6datafiles\data\ex6det.dta", clear
#delimit;
keep caseid yaybop02 ybybop02 ycybop02 ydybop02 yeybop02 yfybop02 gender szindep sector;
#delimit;
ice2 yaybop02 ybybop02 ycybop02 ydybop02 yeybop02 yfybop02 gender
 using imp9 , seed(43146) m(10) genmiss(m) cmd(ologit) replace;
#delimit cr
/*-------------------- some sample plots for observed and imputed data------*/
use imp9, clear
histogram yfybop02, discrete percent gap(50) bfcolor(red) blcolor(red) xtitle(Variety sweep 1) by(myfybop02, legend(off))

/************************************************************************
 Now merge them together
************************************************************************/
use imp1,clear
merge using imp2
drop _merge
merge using imp3
drop _merge
merge using imp4
drop _merge
merge using imp5
drop _merge
merge using imp6
drop _merge
merge using imp7
drop _merge
merge using imp8
drop _merge
merge using imp9
drop _merge
save impdetail,replace

/*-------- now code to calculate scores-------------------------------*/
/*---------------------------- sweep 1---------------------------------------*/
#delimit;
recode yaybus02 yaybop02 yayshp02 yayjrd02 yayscl02 yaywep02 yaygrf02 yayrob02
 yayvnd02 yayhbk02 yayhom02 yaycbk02 yayars02 yayhit02 yayskv02 (7=11);

gen dvol1= yaybus02+yaybop02+yayshp02+yayjrd02+yayscl02+yaywep02+yaygrf02+yayrob02
 + yayvnd02+yayhbk02+yayhom02+yaycbk02+yayars02+yayhit02+yayskv02;

recode yaybus02 yaybop02 yayshp02 yayjrd02 yayscl02 yaywep02 yaygrf02 yayrob02
 yayvnd02 yayhbk02 yayhom02 yaycbk02 yayars02 yayhit02 yayskv02 (1 2 3 4 5 6 11=1);

gen dprev1=0;

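/* dvar1 is used here and in the analysis below but is not created in this
   listing. It is assumed to be the variety score, the number of items
   recoded to 1 above */
gen dvar1= yaybus02+yaybop02+yayshp02+yayjrd02+yayscl02+yaywep02+yaygrf02+yayrob02
 + yayvnd02+yayhbk02+yayhom02+yaycbk02+yayars02+yayhit02+yayskv02;

replace dprev1=1 if dvar1>0;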

drop yaybus02 yaybop02 yayshp02 yayjrd02 yayscl02 yaywep02 yaygrf02 yayrob02
yayvnd02 yayhbk02 yayhom02 yaycbk02 yayars02 yayhit02 yayskv02 ;

/*-------------------------and the same for the other sweeps--------------*/

/*------------------------------------------------------------------------
 now look at the results. you will have 10 sets of data one following
 the other, with missing observations replaced and new variables
 starting with miss to indicate missingness.
 At the end of the file you have two new variables (_i and _j by default).
   _i gives the sequence number in the original file
   _j is 1 for the first imputation and 2 for the second and so on.
------------------------------------------------------------------------*/
#delimit cr
/*-------------------- examine contents---------------------------------*/
histogram dvol1, percent bfcolor(red) blcolor(red) xtitle(Volume sweep 1) by(missdvol1, legend(off))
tabulate missdvar6 dvar6, row
bysort _j: tabulate missdvar6 dprev6, row

/*------------------------------------------------------------------------
 Now post imputation with micombine
 Use regress with no constant to get means and se by gender of DPREV6
 and of the other scores
 You need to set up dummy variables with tab first
------------------------------------------------------------------------*/
tab gender, gen(dgen)
micombine regress dprev1 dgen1 dgen2, noconstant
micombine regress dprev6 dgen1 dgen2, noconstant
micombine regress dvol1 dgen1 dgen2, noconstant
micombine regress dvol6 dgen1 dgen2, noconstant
micombine regress dvar1 dgen1 dgen2, noconstant
micombine regress dvar6 dgen1 dgen2, noconstant

/*------------------------------------------------------------------------
 Then a combined logistic regression incorporating the difference between
 the imputations into the standard errors
 and compare this with the observed data only
------------------------------------------------------------------------*/
tabulate sector, gen( secd)
sort _j
summarize dprev6 gender szindep secd2 secd3 secd4
micombine logistic dprev6 gender szindep secd2 secd3 secd4
micombine logistic dprev6 szindep gender
sort szindep
by szindep: summarize dprev6
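
/*------------------------------------------------------------------------
 A minimal sketch, not part of the original do file. It combines the mean
 of dprev6 over the imputations by hand with Rubin's rules, to show where
 the difference between the imputations enters the standard error.
 Assumptions: the stacked imputed data are in memory, _j runs from 1 to 10,
 and the observations are treated as a simple random sample.
 The local macro m and the scalars qbar, ubar, b and q1 to q10 are names
 made up for this illustration.
------------------------------------------------------------------------*/
#delimit cr
local m = 10
scalar qbar = 0
scalar ubar = 0
forvalues j = 1/`m' {
    quietly summarize dprev6 if _j == `j'
    /* estimate from imputation j and its sampling variance */
    scalar q`j' = r(mean)
    scalar qbar = scalar(qbar) + r(mean)/`m'
    scalar ubar = scalar(ubar) + (r(Var)/r(N))/`m'
}
/* between-imputation variance of the m estimates */
scalar b = 0
forvalues j = 1/`m' {
    scalar b = scalar(b) + (scalar(q`j') - scalar(qbar))^2/(`m' - 1)
}
/* total variance = within + (1 + 1/m) * between */
display "combined mean = " scalar(qbar)
display "combined se   = " sqrt(scalar(ubar) + (1 + 1/`m')*scalar(b))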