Most ML Folks Often Neglect This While Using Linear Regression
The effectiveness of a linear regression model is determined by how well the data conforms to the algorithm's underlying assumptions.
One highly important, yet often neglected assumption of linear regression is homoscedasticity.
A dataset is homoscedastic if the variability of residuals (=actual-predicted) stays the same across the input range.
In contrast, a dataset is heteroscedastic if the residuals have non-constant variance.
Homoscedasticity is extremely critical for linear regression. This is because it ensures that our regression coefficients are reliable. Moreover, we can trust that the predictions will always stay within the same confidence interval.
Share this post on LinkedIn: Post Link.
Thanks for reading Daily Dose of Data Science! Subscribe for free to learn something new about Python and Data Science every day.
Check out Sourcery, an automated code refactoring tool for Python to make your code more elegant, concise, and pythonic.
Find the code for my tips here: GitHub.
I like to explore, experiment and write about data science concepts and tools. You can read my articles on Medium. Also, you can connect with me on LinkedIn.