Daily Dose of Data Science
Visualize The Performance Of Any Linear Regression Model With This Simple Plot
Assumption turned into performance validation.
Linear regression, in its classical form, assumes that the model's errors are normally distributed, and the residuals (actual - predicted) are our estimate of those errors.
If the model is underperforming, it may be due to a violation of this assumption.
A residual distribution plot is a great way to verify this and also determine the model's performance.
As the name suggests, it depicts the distribution of the residuals, typically as a histogram or a density curve.
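Here is a minimal sketch of such a plot. The synthetic data, the variable names, and the choice of NumPy's `polyfit` with Matplotlib are assumptions made for illustration, not the code behind the original figure.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration): y = 3x + 2 + Gaussian noise
x = rng.uniform(0, 10, size=500)
y = 3.0 * x + 2.0 + rng.normal(0.0, 1.0, size=500)

# Fit a straight line by ordinary least squares
slope, intercept = np.polyfit(x, y, deg=1)
predicted = slope * x + intercept

# Residuals, defined as in the text: actual - predicted
residuals = y - predicted

# Histogram of the residuals with a reference line at zero
fig, ax = plt.subplots(figsize=(6, 4))
ax.hist(residuals, bins=30, edgecolor="black")
ax.axvline(0.0, linestyle="--", color="red")
ax.set_xlabel("Residual (actual - predicted)")
ax.set_ylabel("Count")
ax.set_title("Residual distribution")
fig.tight_layout()
# plt.show()  # uncomment in an interactive session
```

Because the data here really is linear with Gaussian noise, the histogram should look roughly bell-shaped and centered at zero.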
A good residual plot will:
Follow a normal distribution centered at zero
NOT reveal trends in the residuals
A bad residual plot will:
Deviate from a normal distribution
Reveal patterns in the residuals, such as skew or multiple peaks
Thus, the more normally distributed the residual plot looks, the more confident you can be about your model.
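If you want a number to go with the visual check, one option is to compute the skewness of the residuals and run a Shapiro-Wilk test. The sketch below is an illustration under assumed synthetic data. It fits a straight line to a linear dataset and to a quadratic one, then compares the two sets of residuals. The helper `linear_fit_residuals` is a hypothetical name introduced here.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
x = rng.uniform(0, 10, size=500)
noise = rng.normal(0.0, 1.0, size=500)

def linear_fit_residuals(x, y):
    """Fit y ~ slope * x + intercept by least squares; return actual - predicted."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return y - (slope * x + intercept)

# Case 1: the true relationship is linear, so a straight line is the right model
res_good = linear_fit_residuals(x, 3.0 * x + 2.0 + noise)

# Case 2: the true relationship is quadratic, so a straight line is misspecified
res_bad = linear_fit_residuals(x, x**2 + noise)

for name, res in [("linear data", res_good), ("quadratic data", res_bad)]:
    skewness = stats.skew(res)
    # Shapiro-Wilk test: a small p-value suggests the sample is not normal
    _, p_value = stats.shapiro(res)
    print(f"{name}: skew={skewness:.2f}, Shapiro-Wilk p={p_value:.3g}")
```

The residuals from the misspecified fit should come out clearly skewed with a very small p-value. Treat the test as a complement to the plot rather than a replacement: with large samples, even tiny departures from normality can produce small p-values.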
This is especially useful when the regression line is difficult to visualize, i.e., in a high-dimensional dataset.
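To see why this works in high dimensions, note that the residuals are always one number per row, however many features the model uses. The following sketch assumes a synthetic dataset with 20 features and fits it with NumPy's `lstsq`. The sizes and coefficients are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)
n_samples, n_features = 1000, 20

# Synthetic data (an assumption): 20 features, linear signal, Gaussian noise
X = rng.normal(size=(n_samples, n_features))
true_coef = rng.normal(size=n_features)
y = X @ true_coef + 1.5 + rng.normal(0.0, 0.5, size=n_samples)

# Append a column of ones so the least-squares fit includes an intercept
X_design = np.column_stack([X, np.ones(n_samples)])
coef, *_ = np.linalg.lstsq(X_design, y, rcond=None)

# Residuals: actual - predicted, a 1-D array regardless of n_features
residuals = y - X_design @ coef

# The fitted hyperplane cannot be drawn, but the residuals can still be binned
counts, bin_edges = np.histogram(residuals, bins=20)
print(residuals.shape)
print(f"mean={residuals.mean():.3f}, std={residuals.std():.3f}")
```

The `counts` and `bin_edges` arrays are what a histogram plot would draw, so the same residual distribution plot applies unchanged.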
After running a linear model, always check the distribution of the residuals.
This will help you:
Validate the model's assumptions
Determine how good your model is
Find ways to improve it (if needed)
👉 Over to you: What are some other ways/plots to determine the linear model's performance?
Thanks for reading!
Find the code for my tips here: GitHub.