# Simple Linear Regression Module Exercise


The following three questions are essentially research questions. You can work through them using the LSYPE 15,000 dataset and your newfound statistical superpowers! We recommend that you answer them in full sentences, with supporting tables or graphs where appropriate, as this will help when you come to report your own research. There is a link to the answers at the bottom of the page.

Note: The variable names as they appear in the SPSS dataset are listed in brackets.

Question 1

Is there a statistically significant association between whether or not a student has been truant in the last twelve months (truancy) and whether or not they achieve five GCSE qualifications at grades A*-C, including maths and English (fiveem)?

Use crosstabulation and a chi-square analysis to answer this question and provide the supporting statistics.
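The module carries out this analysis in SPSS. For readers who want to see the arithmetic behind the output, the following is a minimal sketch in plain Python of a Pearson chi-square test on a 2x2 crosstabulation. The function name `chi_square_2x2` is hypothetical and the counts are invented for illustration; they are not LSYPE results and do not answer the question.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts.

    No continuity correction is applied. Returns (chi2, p_value).
    """
    if len(table) != 2 or any(len(row) != 2 for row in table):
        raise ValueError("this sketch only handles 2x2 tables")

    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # A 2x2 table has 1 degree of freedom, for which the upper-tail
    # probability of the chi-square distribution is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Invented counts (assumption, NOT LSYPE data).
# Rows: truancy = no, yes. Columns: fiveem = no, yes.
counts = [[30, 70],
          [60, 40]]

chi2, p_value = chi_square_2x2(counts)
print(f"chi-square = {chi2:.2f}, df = 1, p = {p_value:.5f}")
```

When reporting the real analysis, give the chi-square statistic, the degrees of freedom and the p-value alongside the crosstabulation percentages.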

Question 2

We have seen that age 11 exam scores have a strong positive correlation with age 14 exam scores. Is there a similar association between exam scores at age 11 (ks2stand) and exam scores at age 16 (ks4stand)?

Draw a scatterplot and perform a bivariate correlation to answer this question.
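As a rough illustration of what the bivariate correlation computes, here is a minimal sketch of Pearson's r in plain Python. The function name `pearson_r` is hypothetical and the five pairs of scores are invented; they are not taken from the LSYPE dataset. The scatterplot itself is not covered by this sketch.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")

    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sums of squares and cross-products of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)

    return sxy / math.sqrt(sxx * syy)


# Invented standardised scores (assumption, NOT LSYPE data).
ks2stand = [-10.0, -5.0, 0.0, 5.0, 10.0]
ks4stand = [-8.0, -6.0, 1.0, 3.0, 10.0]

r = pearson_r(ks2stand, ks4stand)
print(f"r = {r:.3f}, r squared = {r ** 2:.3f}")
```

Squaring r gives the proportion of variance the two variables share, which is a convenient way to compare the strength of this association with the age 11 to age 14 one.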

Question 3

The Income Deprivation Affecting Children Index (IDACI) provides a standardised measure of how relatively deprived the students in the LSYPE sample are. Because the measure is standardised, the mean score is 0: negative values represent students from relatively affluent backgrounds and positive values represent students from relatively poor backgrounds. Can IDACI score (IDACI_n) be used to predict students’ exam scores at age 14 (ks3stand)?

Perform a simple linear regression using ks3stand as the outcome variable and IDACI_n as the explanatory variable to answer this question. Be sure to check that the assumptions of the analysis are met.
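The regression itself is run in SPSS in this module. The sketch below shows, under stated assumptions, the least-squares calculation behind it in plain Python: the slope, the intercept, R squared and the residuals. The function name `fit_simple_regression` is hypothetical and the data are invented, so the coefficients say nothing about the real relationship between IDACI_n and ks3stand.

```python
def fit_simple_regression(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared, residuals).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")

    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residuals are observed minus predicted outcome values.
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]

    ss_residual = sum(e ** 2 for e in residuals)
    ss_total = sum((b - mean_y) ** 2 for b in y)
    r_squared = 1 - ss_residual / ss_total

    return slope, intercept, r_squared, residuals


# Invented values (assumption, NOT LSYPE data).
idaci_n = [-1.0, -0.5, 0.0, 0.5, 1.0]    # explanatory variable
ks3stand = [6.0, 4.0, 1.0, -3.0, -8.0]   # outcome variable

slope, intercept, r_squared, residuals = fit_simple_regression(idaci_n, ks3stand)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, R squared = {r_squared:.3f}")
```

The residuals are returned because they are what you inspect when checking assumptions: plotted against the predicted values they should show no pattern and a roughly constant spread, and their distribution should be approximately normal.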