Multicollinearity in Binary Logistic Regression Model

A major issue in fitting a binary logistic regression model arises when the explanatory variables included in the model are strongly correlated with one another. Multicollinearity produces unstable coefficient estimates and unreliable variance estimates, which in turn affect confidence intervals and hypothesis tests. The aim of this paper was to examine several diagnostic measures for detecting multicollinearity: tolerance, the Variance Inflation Factor (VIF), the condition index, and variance proportions. Data from a survey of road injuries are used to demonstrate the adapted diagnostics. Secondary data for this analysis, covering 2014 to 2016, were obtained from the Traffic Police Headquarters in Colombo, Sri Lanka. The response variable is accident severity, which is divided into two categories: grave and non-grave. The correlation matrix, tolerance, and VIF were used to detect multicollinearity, and the condition index and variance proportions were used to validate the findings. Remedies available for logistic regression include increasing the sample size, removing one of the correlated variables, and combining the correlated variables into an index. It can be concluded that omitting one of the correlated variables can substantially reduce multicollinearity without increasing the sample size. As a result, a stable and valid predictive logistic regression model can be developed, provided that the data are adequately inspected and steps are taken to counter multicollinearity.
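The diagnostics named above can be illustrated with a minimal sketch. It uses synthetic data, not the road injury data, and the helper names `vif_and_tolerance` and `condition_indices` are hypothetical. For each explanatory variable j, the sketch regresses that variable on the others to obtain R_j^2, then takes tolerance as 1 - R_j^2 and VIF as 1 / (1 - R_j^2). Condition indices are computed from the singular values of the design matrix (intercept included) after scaling each column to unit length. The comparison at the end, dropping one of two correlated variables, only mirrors the remedy discussed above under these assumptions.

```python
import numpy as np


def vif_and_tolerance(X):
    """Return (vif, tolerance) arrays with one entry per column of X."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vif = np.empty(p)
    for j in range(p):
        y = X[:, j]
        # Regress column j on an intercept plus all remaining columns.
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        vif[j] = 1.0 / (1.0 - r2)
    return vif, 1.0 / vif


def condition_indices(X):
    """Condition indices of the unit-length-scaled design matrix with intercept."""
    X = np.asarray(X, dtype=float)
    A = np.column_stack([np.ones(X.shape[0]), X])
    A = A / np.linalg.norm(A, axis=0)  # scale each column to unit length
    s = np.linalg.svd(A, compute_uv=False)
    return s.max() / s


# Synthetic predictors (an assumption for illustration): x2 is nearly a copy of x1.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.1, size=n)
x3 = rng.normal(size=n)

X_full = np.column_stack([x1, x2, x3])
vif_full, tol_full = vif_and_tolerance(X_full)
ci_full = condition_indices(X_full)

# Remedy: omit one of the two correlated variables and recompute.
X_reduced = np.column_stack([x1, x3])
vif_reduced, tol_reduced = vif_and_tolerance(X_reduced)

print("VIF (all variables):      ", np.round(vif_full, 2))
print("Tolerance (all variables):", np.round(tol_full, 4))
print("Largest condition index:  ", round(float(ci_full.max()), 2))
print("VIF (x2 omitted):         ", np.round(vif_reduced, 2))
```

These diagnostics depend only on the explanatory variables, so they can be computed before the logistic model is fitted. Cutoffs for what counts as a high VIF or condition index are rules of thumb and are not specified here.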

Author(s) Details

Ms. N. A. M. R. Senaviratna
Department of Mathematics, The Open University of Sri Lanka, Sri Lanka.

Mr. T. M. J. A. Cooray
Department of Mathematics, University of Moratuwa, Sri Lanka.
