* SPSS missing value analysis for exemplar 6.
* Gillian Raab, June 05.

GET FILE='G:\peas\ex6datafiles\data\ex6.sav'.

FORMATS dprev1 dvar1 dvol1 dprev2 dvar2 dvol2 dprev3 dvar3 dvol3
  dprev4 dvar4 dvol4 dprev5 dvar5 dvol5 dprev6 dvar6 dvol6 (F5.3).

*--------------------------------means for complete cases----------------------------------------------.
TABLES
  /FORMAT BLANK MISSING('.')
  /OBSERVATION dprev1 dvar1 dvol1 dprev2 dvar2 dvol2 dprev3 dvar3 dvol3
    dprev4 dvar4 dvol4 dprev5 dvar5 dvol5 dprev6 dvar6 dvol6
  /TABLES (dprev1+dvar1+dvol1+dprev2+dvar2+dvol2+dprev3+dvar3+dvol3+dprev4+dvar4+dvol4+dprev5+dvar5+dvol5+dprev6+dvar6+dvol6)
    BY GENDER .

* Use Missing Value Analysis to get the % missing for each variable
  and the missing value patterns.
* Patterns shared by fewer cases can be viewed by lowering the value of PERCENT,
  so that fewer patterns are excluded.
* Then impute the missing data with EM and save the result to a new file.
MVA
  SASMKY80 SBSMKY80 SCSMKY80 SDSMKY80 SESMKY80 SFSMKY80
  RAALCY80 RBALCY80 RCALCY80 RDALCY80 REALCY80 RFALCY80
  GENDER ETHGP SAAGEGP HZRESPRE SZINDEP SZABS04 HZEVREFI YZLEAVE sector
  dprev1 dvar1 dprev2 dvar2 dprev3 dvar3 dprev4 dvar4 dprev5 dvar5 dprev6 dvar6
  drgcode1 drgcode2 drgcode3 drgcode4 drgcode5 drgcode6
  dvol1 dvol2 dvol3 dvol4 dvol5 dvol6
  /TPATTERN PERCENT=1
  /EM ( TOLERANCE=0.001 CONVERGENCE=0.0001 ITERATIONS=25
    OUTFILE='G:\peas\ex6datafiles\data\ex6inpSPSS.sav' ) .

*----------------------- and get the imputed data--------.
GET FILE='G:\peas\ex6datafiles\data\ex6inpSPSS.sav'.

* Define Variable Properties.
* Set display formats in order to examine the values imputed for the missing data.
FORMATS
  SASMKY80 SBSMKY80 SCSMKY80 SDSMKY80 SESMKY80 SFSMKY80
  RAALCY80 RBALCY80 RCALCY80 RDALCY80 REALCY80 RFALCY80
  GENDER ETHGP SAAGEGP HZRESPRE SZINDEP SZABS04 HZEVREFI YZLEAVE sector
  dprev1 dvar1 dprev2 dvar2 dprev3 dvar3 dprev4 dvar4 dprev5 dvar5 dprev6 dvar6
  drgcode1 drgcode2 drgcode3 drgcode4 drgcode5 drgcode6
  dvol1 dvol2 dvol3 dvol4 dvol5 dvol6 (F5.3).
EXECUTE.
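
* Optional check, not part of the original analysis: a minimal sketch that lists
  the mean, minimum and maximum of a few imputed variables. EM imputation can
  produce non-integer values and values outside the valid codes, which is what
  the rounding and truncation steps that follow deal with. The choice of the
  four sweep 6 variables here is only an illustration.
DESCRIPTIVES VARIABLES=dprev6 dvar6 dvol6 drgcode6
  /STATISTICS=MEAN MIN MAX.
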
* You can now look at the data file to see how the values have been imputed.
GRAPH
  /HISTOGRAM=dvar6 .

* Now replace values of variables with the nearest integer and pull in extreme values.
DO REPEAT v= SASMKY80 SBSMKY80 SCSMKY80 SDSMKY80 SESMKY80 SFSMKY80
  RAALCY80 RBALCY80 RCALCY80 RDALCY80 REALCY80 .
COMPUTE v=RND(v).
FORMATS v(F1.0).
IF (v<0) v=0.
IF (v>4) v=4.
END REPEAT PRINT.
EXECUTE.
* Note: the PRINT keyword on the END REPEAT statement shows the commands
  that are run by this DO REPEAT loop.

DO REPEAT v= RFALCY80 .
COMPUTE v=RND(v).
FORMATS v(F1.0).
IF (v<0) v=0.
IF (v>5) v=5.
END REPEAT .
EXECUTE.

DO REPEAT v= dprev1 dprev2 dprev3 dprev4 dprev5 dprev6 .
COMPUTE v=RND(v).
FORMATS v(F1.0).
IF (v<0) v=0.
IF (v>1) v=1.
END REPEAT .
EXECUTE.

DO REPEAT v= dvar1 dvar2 dvar3 dvar4 dvar5 dvar6 .
COMPUTE v=RND(v).
FORMATS v(F1.0).
IF (v<0) v=0.
END REPEAT .
EXECUTE.

DO REPEAT v= dvol1 dvol2 dvol3 dvol4 dvol5 dvol6 .
COMPUTE v=RND(v).
FORMATS v(F1.0).
IF (v<0) v=0.
END REPEAT .
EXECUTE.

DO REPEAT v= drgcode1 drgcode2 drgcode3 drgcode4 drgcode5 drgcode6 .
COMPUTE v=RND(v).
FORMATS v(F1.0).
IF (v<0) v=0.
IF (v>3) v=3.
END REPEAT .
EXECUTE.

*--------------------------------means for imputed data----------------------------------------------.
TABLES
  /FORMAT BLANK MISSING('.')
  /OBSERVATION dprev1 dvar1 dvol1 dprev2 dvar2 dvol2 dprev3 dvar3 dvol3
    dprev4 dvar4 dvol4 dprev5 dvar5 dvol5 dprev6 dvar6 dvol6
  /TABLES (dprev1+dvar1+dvol1+dprev2+dvar2+dvol2+dprev3+dvar3+dvol3+dprev4+dvar4+dvol4+dprev5+dvar5+dvol5+dprev6+dvar6+dvol6)
    BY GENDER .

*--------------- now a table that shows how data can be inconsistent.
*---------------- first select only those with prevalence of offending at sweep 6 = 0.
COMPUTE filter_$=(dprev6=0).
FILTER BY filter_$.
EXECUTE .
CROSSTABS
  /TABLES=dvar5 BY dvar6
  /FORMAT= AVALUE TABLES
  /CELLS= COUNT
  /COUNT ROUND CELL .
FILTER OFF.
USE ALL.
EXECUTE .

*------------------------------- make 1/0 variable for offending.
COMPUTE anyoff=2-HZEVREFI.
EXECUTE.
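
* Optional check, not part of the original analysis: LOGISTIC REGRESSION needs
  a dependent variable with exactly two values. HZEVREFI is in the MVA variable
  list but is not rounded in the DO REPEAT loops, so, assuming it had missing
  values that EM filled in, anyoff could take non-integer values at this point.
  A frequency table shows whether anyoff contains only 0 and 1.
FREQUENCIES VARIABLES=anyoff.
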
LOGISTIC REGRESSION anyoff
  /METHOD = ENTER drgcode6 drgcode1 GENDER ETHGP szindep
  /CATEGORICAL drgcode6 drgcode1 GENDER ETHGP szindep
  /CONTRAST (drgcode1)=Indicator(1)
  /CONTRAST (drgcode6)=Indicator(1)
  /CONTRAST (szindep)=Indicator(2).

* Check out tables.
CROSSTABS
  /TABLES= drgcode1 BY anyoff
  /FORMAT= AVALUE TABLES
  /CELLS= COUNT ROW
  /COUNT ROUND CELL .
CROSSTABS
  /TABLES= drgcode6 BY anyoff
  /FORMAT= AVALUE TABLES
  /CELLS= COUNT ROW
  /COUNT ROUND CELL .

*----------------- now run these models on file with missing data-------------.
GET FILE='G:\peas\ex6datafiles\data\ex6.sav'.

*------------------------------- make 1/0 variable for offending.
COMPUTE anyoff=2-HZEVREFI.
EXECUTE.

LOGISTIC REGRESSION anyoff
  /METHOD = ENTER drgcode6 drgcode1 GENDER ETHGP szindep
  /CATEGORICAL drgcode6 drgcode1 GENDER ETHGP szindep
  /CONTRAST (drgcode1)=Indicator(1)
  /CONTRAST (drgcode6)=Indicator(1)
  /CONTRAST (szindep)=Indicator(2).

* Check out tables.
CROSSTABS
  /TABLES= drgcode1 BY anyoff
  /FORMAT= AVALUE TABLES
  /CELLS= COUNT ROW
  /COUNT ROUND CELL .
CROSSTABS
  /TABLES= drgcode6 BY anyoff
  /FORMAT= AVALUE TABLES
  /CELLS= COUNT ROW
  /COUNT ROUND CELL .