Daily Dose of Data Science

Use Box Plots With Caution! They May Be Misleading.


Avi Chawla
Mar 2, 2023

Box plots are quite common in data analysis. But they can be misleading at times. Here's why.

A box plot is a graphical representation of just five numbers: the minimum, first quartile, median, third quartile, and maximum.
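To make those five numbers concrete, here is a minimal sketch that computes them with Python's standard library. The helper name `five_number_summary` and the sample data are made up for illustration, and the "inclusive" quartile method is just one of several conventions. Also keep in mind that many plotting libraries, matplotlib among them, draw the whiskers at 1.5 times the interquartile range by default and show points beyond that as outliers, so the whisker ends may not be the exact min and max.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a sequence of numbers."""
    data = sorted(values)
    # method="inclusive" interpolates quartiles between the observed
    # minimum and maximum; other conventions give slightly different values.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return (data[0], q1, median, q3, data[-1])

# Hypothetical sample, chosen only for illustration.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(five_number_summary(sample))  # (1, 3.0, 5.0, 7.0, 9)
```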

Thus, two different datasets that share the same five values will produce identical box plots. This can be misleading, and one may draw the wrong conclusions.
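Here is a small sketch of that situation. Both datasets below are invented for illustration: one is spread roughly evenly, the other piles up at the quartiles. The helper `summary` is hypothetical and uses the "inclusive" quartile convention, which is an assumption on my part. Under that convention the two datasets share the same five numbers, so a box plot built from those numbers could not tell them apart.

```python
import statistics

def summary(values):
    """Return (min, Q1, median, Q3, max) using inclusive quartiles."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return (data[0], q1, median, q3, data[-1])

# Two made-up datasets with clearly different shapes.
evenly_spread = [0, 12, 25, 37, 50, 63, 75, 88, 100]
clustered = [0, 25, 25, 25, 50, 75, 75, 75, 100]

print(summary(evenly_spread))  # (0, 25.0, 50.0, 75.0, 100)
print(summary(clustered))      # (0, 25.0, 50.0, 75.0, 100)

# Same five numbers, different data.
print(summary(evenly_spread) == summary(clustered))  # True
print(evenly_spread == clustered)                    # False
```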

The takeaway is NOT that box plots should not be used. Instead, look at the underlying distribution too. Here, histograms and violin plots can help.
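As a rough illustration of looking at the distribution, the sketch below prints a text histogram for two made-up datasets. In practice you would more likely reach for a plotting library's histogram or violin plot; this dependency-free version, with its hypothetical `bin_counts` helper and an arbitrary bin width of 25, only shows the idea that binned counts expose shape differences a five-number summary hides.

```python
from collections import Counter

def bin_counts(values, bin_width):
    """Map each bin's lower edge to the number of values in [edge, edge + bin_width)."""
    counts = Counter(int(v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

def print_histogram(name, values, bin_width):
    """Print one row of '#' marks per non-empty bin."""
    print(name)
    for edge, count in bin_counts(values, bin_width).items():
        print(f"  {edge:>3}+ | {'#' * count}")

# Made-up datasets with different shapes, chosen only for illustration.
evenly_spread = [0, 12, 25, 37, 50, 63, 75, 88, 100]
clustered = [0, 25, 25, 25, 50, 75, 75, 75, 100]

print_histogram("evenly_spread", evenly_spread, 25)
print_histogram("clustered", clustered, 25)
```

The bars for `clustered` come out uneven, with peaks in the 25 and 75 bins, while `evenly_spread` stays nearly flat.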

Lastly, always remember that when you condense a dataset, you don't see the whole picture. You are losing essential information.

