Thursday, April 25, 2013

Titanic Data Competition - Submission 7

I got my best score yet yesterday.

First, I had to try to replicate my previous best score. I need to take more care in documenting my models. In the end, I found I had saved the SPSS output, and that had the model details.

My previous best score was 0.78947.

I then retried the best-scoring model, just replacing combined age with regression age. This scored 0.77512.

Next, I added family to the independent variables (family is the total of sibsp and parch). This moved me up 211 places, to position 124. The model scored 0.79904. The SPSS classification table had 83.2% correctly classified.
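
For anyone wanting to reproduce the family variable outside SPSS, here is a minimal sketch in Python. The passenger records are made up for illustration, and the helper name add_family is my own; only the sibsp and parch fields come from the competition data.

```python
# Hypothetical passenger records; only the sibsp and parch field names
# come from the competition data, the values are invented.
passengers = [
    {"name": "A", "sibsp": 1, "parch": 0},
    {"name": "B", "sibsp": 0, "parch": 2},
    {"name": "C", "sibsp": 0, "parch": 0},
]


def add_family(rows):
    """Return copies of the rows with a derived 'family' field (sibsp + parch)."""
    return [{**row, "family": row["sibsp"] + row["parch"]} for row in rows]


with_family = add_family(passengers)
family_sizes = [row["family"] for row in with_family]
print(family_sizes)
```

The original records are left untouched, so the same input can be reused when trying other derived variables.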

Next, I added adj cabin as an independent variable. While the SPSS classification table showed 84.2% correctly classified, this model scored only 0.78469.
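
A higher percentage in the classification table did not mean a higher competition score here. One likely reason, and this is my assumption, is that the classification table is computed on the same rows used to fit the model, while the competition score is computed on rows the model has not seen. A rough way to check a model before submitting is to hold back part of the training data and measure accuracy on that part only. The sketch below uses made-up labels and hypothetical helper names (accuracy, holdout_split); it is not how SPSS does it.

```python
import random


def accuracy(actual, predicted):
    """Fraction of cases where the predicted label equals the actual label."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("cannot score an empty list")
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def holdout_split(rows, holdout_fraction=0.25, seed=0):
    """Shuffle the rows and split them into a fitting part and a held-back part."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - holdout_fraction))
    return shuffled[:cut], shuffled[cut:]


# Made-up survival labels, for illustration only.
actual = [1, 0, 1, 1]
predicted = [1, 0, 0, 1]
print(accuracy(actual, predicted))  # 3 of 4 predictions match

# Row ids stand in for passengers; fit on the first part, score on the second.
fit_rows, check_rows = holdout_split(range(8))
print(len(fit_rows), len(check_rows))
```

If accuracy on the held-back rows drops when a variable is added, the variable is probably fitting noise in the training rows.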

Finally, I took out adj cabin and added Age_Sex_Class. This three-way interaction had 84% correctly classified as per SPSS but scored 0.79426, which, whilst not my best score, still beat the previous day's best.
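
For reference, a three-way interaction term is usually the product of the three variables. The sketch below shows one way such a term could be built in Python. The 0/1 coding of sex and the function name age_sex_class are assumptions for illustration, and the coding SPSS applies internally may differ.

```python
def age_sex_class(age, sex, pclass):
    """Product interaction term: age x sex code x passenger class.

    Assumes sex is coded 1 for "female" and 0 otherwise (an illustrative
    choice), so the term is zero for every male passenger.
    """
    sex_code = 1 if sex == "female" else 0
    return age * sex_code * pclass


# Made-up passengers, for illustration only.
term_female = age_sex_class(30.0, "female", 2)
term_male = age_sex_class(30.0, "male", 2)
print(term_female, term_male)
```

With this coding the term only separates passengers within the group coded 1, which is worth bearing in mind when reading its coefficient.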

The challenge now is to move the score into the 0.80000 range!

