Sunday, June 2, 2013

Logistic Regression Breaks Through The 0.80 Barrier

I had been stuck at 0.79904 for a while and was wondering whether logistic regression was going to take me any further. Well, today's results showed me there was still some life left in boring old logistic regression.

Initially tried standardising the continuous variables (fare, combined age, fare per person, family and age_class interaction) but that didn't help.
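
For reference, standardising a set of columns in R can be sketched roughly as below. This is only an illustration: the toy data frame and its values are made up, and only the column names are taken from the model formula in this post.

```r
# Toy stand-in for the training data; the values are invented for illustration.
train <- data.frame(
  fare         = c(7.25, 71.28, 7.93, 53.10, 8.05),
  combined_age = c(22, 38, 26, 35, 54),
  family       = c(1, 1, 0, 1, 0)
)

# Continuous columns to standardise (a subset of those listed above).
cont_vars <- c("fare", "combined_age", "family")

# scale() centres each column to mean 0 and rescales it to standard deviation 1;
# as.numeric() drops the matrix attributes that scale() attaches.
train[cont_vars] <- lapply(train[cont_vars], function(x) as.numeric(scale(x)))

# Each standardised column should now have mean ~0 and sd 1.
round(colMeans(train[cont_vars]), 10)
sapply(train[cont_vars], sd)
```

That this made no difference is probably to be expected: for an unpenalised logistic regression with an intercept, rescaling a predictor changes its coefficient but not the fitted probabilities.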

Then added age squared to the model - that lifted my result to 0.80383.

Added "fare squared" and "fare per person squared" - that didn't help.

Finally added "age class interaction squared" and that took my score to 0.80861.
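
A minimal sketch of how the squared terms might be built is below. The toy values are made up, and the definition of the age/class interaction as age multiplied by passenger class is an assumption for illustration; only the column names come from the model formula in this post.

```r
# Toy stand-in for the training data; the values are invented for illustration.
train <- data.frame(
  combined_age = c(22, 38, 26, 35, 54),
  pclass       = c(3, 1, 3, 1, 1)
)

# Assumption: the interaction is simply age multiplied by passenger class.
train$age_class.interaction <- train$combined_age * train$pclass

# Squared terms let the model fit a curved (quadratic) effect on the log-odds
# rather than a straight line.
train$age_squared       <- train$combined_age^2
train$age_class_squared <- train$age_class.interaction^2

train
```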

So I'm ranked equal 71st in the competition. There are 27 competitors ranked 71st.

Challenge now is to see what further improvements can be made to the logistic regression model.

glm(formula = survived ~ male + pclass + fare + fare_per_person + 
    Title + age_class.interaction + sex_class + combined_age + 
    family + age_squared + age_class_squared, family = binomial(), 
    data = train)
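
As a rough sketch of how a model like this gets fitted and turned into 0/1 predictions, here is a cut-down version on simulated data. Everything in it is an assumption made for illustration: the data frame `toy`, the simulated outcome, the reduced set of predictors, and the 0.5 cut-off.

```r
set.seed(1)
n <- 200

# Simulated stand-in data; not the real Titanic training set.
toy <- data.frame(
  male         = rbinom(n, 1, 0.6),
  pclass       = sample(1:3, n, replace = TRUE),
  combined_age = round(runif(n, 1, 70))
)
toy$age_squared <- toy$combined_age^2

# Simulated outcome drawn from a made-up linear predictor.
lin <- 1.5 - 2 * toy$male - 0.5 * toy$pclass - 0.01 * toy$combined_age
toy$survived <- rbinom(n, 1, plogis(lin))

# Fit a logistic regression with a reduced version of the formula above.
fit <- glm(survived ~ male + pclass + combined_age + age_squared,
           family = binomial(), data = toy)

# Predicted survival probabilities, then hard 0/1 predictions (assumed cut-off 0.5).
prob <- predict(fit, newdata = toy, type = "response")
pred <- as.integer(prob > 0.5)

# Share of rows predicted correctly on the data the model was fitted to.
mean(pred == toy$survived)
```

The accuracy printed here is measured on the same rows the model was fitted to, so it will tend to flatter the model compared with a score on unseen data.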

