
The short answer is that you won't be able to trust your model's predictions, and it could give weak or even misleading results.
The longer answer is that it depends on the type of violation.
If the linearity assumption is violated and the relationship between the variables isn't actually linear, the estimates will likely span a wider range of values. Their mean might still be the same, but a model fit under a violated linearity assumption will ultimately be less precise.
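To see this in practice, here is a minimal sketch (using NumPy and scikit-learn, on made-up data where the true relationship is quadratic) of the classic diagnostic: a correctly specified linear model leaves residuals with no structure, while a violated linearity assumption leaves a systematic pattern behind.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Hypothetical data: the true relationship is quadratic, not linear.
X = rng.uniform(-3, 3, size=(500, 1))
y = 0.5 * X[:, 0] ** 2 + rng.normal(scale=0.3, size=500)

model = LinearRegression().fit(X, y)
residuals = y - model.predict(X)

# With a correctly specified linear model, residuals should show no
# structure. Here they curve systematically: negative in the middle
# of the range and positive at the extremes.
for lo, hi in [(-3, -1), (-1, 1), (1, 3)]:
    mask = (X[:, 0] >= lo) & (X[:, 0] < hi)
    print(f"mean residual for x in [{lo}, {hi}): {residuals[mask].mean():+.2f}")
```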
If the no-multicollinearity assumption is violated and there are two or more highly correlated predictors, it actually won't affect the predictions. However, it will make the coefficient estimates unstable, and therefore reduce the interpretability of the model.
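A quick way to convince yourself of this: refit the model on bootstrap resamples of hypothetical data with two nearly identical predictors, then compare how much the coefficients move versus how much the predictions move. This is a sketch, not a formal diagnostic such as VIF.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n = 200

# Hypothetical data: x2 is almost a copy of x1 (near-perfect collinearity).
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.01, size=n)
X = np.column_stack([x1, x2])
y = 3 * x1 + rng.normal(scale=0.5, size=n)

coefs, preds = [], []
x_new = np.array([[1.0, 1.0]])  # a fixed point to predict at
for _ in range(20):
    idx = rng.integers(0, n, size=n)          # bootstrap resample
    m = LinearRegression().fit(X[idx], y[idx])
    coefs.append(m.coef_)
    preds.append(m.predict(x_new)[0])

coefs = np.array(coefs)
# Individual coefficients swing wildly between resamples...
print("std of coef estimates:", coefs.std(axis=0))
# ...but the predictions themselves barely move.
print("std of predictions:  ", np.std(preds))
```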
If the variance of the residuals is not constant, i.e. the assumption of homoscedasticity is violated, then predictions will still be accurate on average, but error-based evaluation with Mean Squared Error becomes unreliable, so it will be difficult to compare models.
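One common way to check this assumption is the Breusch-Pagan test from statsmodels. The sketch below, on made-up data where the noise grows with x, flags the violation with a small p-value.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(2)

# Hypothetical data: the error scale grows with x, so variance is not constant.
x = rng.uniform(1, 10, size=300)
y = 2 * x + rng.normal(scale=0.5 * x)

X = sm.add_constant(x)
fit = sm.OLS(y, X).fit()

# Breusch-Pagan regresses the squared residuals on the predictors;
# a small p-value indicates heteroscedasticity.
lm_stat, lm_pvalue, _, _ = het_breuschpagan(fit.resid, X)
print(f"Breusch-Pagan p-value: {lm_pvalue:.4f}")
```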
Finally, the impact of violating the normality assumption depends on the sample size. If the sample is large enough, the Central Limit Theorem takes over and honestly we don't have to worry about it too much. If the sample is small, the predictions of the model are not reliable.
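For small samples, a quick sanity check is the Shapiro-Wilk test on the residuals. The sketch below uses hypothetical skewed (exponential) errors with n = 25, exactly the regime where the violation matters.

```python
import numpy as np
from scipy import stats
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)

# Hypothetical small sample with heavily skewed (exponential) errors.
X = rng.uniform(0, 5, size=(25, 1))
y = 1.5 * X[:, 0] + rng.exponential(scale=1.0, size=25)

model = LinearRegression().fit(X, y)
residuals = y - model.predict(X)

# Shapiro-Wilk tests the null hypothesis that the residuals are normal;
# a small p-value flags a violation, which matters most at small n.
stat, p_value = stats.shapiro(residuals)
print(f"Shapiro-Wilk statistic={stat:.3f}, p-value={p_value:.4f}")
```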
Ultimately, the assumptions should always be upheld in order to have a reliable and interpretable model. When they are violated, examine the features to confirm that linear regression is actually the right model, and if it isn't, switch to one that fits the data.
Related Posts –
- What are the assumptions of OLS Linear Regression?
- Introduction to Linear Regression in Machine Learning