8 Classic Alternatives to Traditional Plots That Every Data Scientist Must Add in Their Visualisation Toolkit
A consolidated guide on best plotting ideas discussed here.
Subscribe for free to learn something new and insightful about Python and Data Science every day. Also, get a Free Data Science PDF (550+ pages) with 320+ tips.
Scatter plots, bar plots, line plots, box plots, and heatmaps are the most frequently used plots for data visualization.
Although they are simple and known to almost everyone, I believe they are not the right choice to cover every possible scenario.
Many other plots derived from these standard plots can be much more suitable if used appropriately.
Therefore, today, let’s discuss a few alternatives to these popular plots.
I will also explain specific situations where they can be more useful than standard plots.
This post is a consolidation of some of my previous plotting posts published in this newsletter.
If you have never seen them before, then there’s new information for you.
If you have seen them before, then this will be a good refresher for you.
In any case, a consolidated guide will be quite useful to look back on later instead of scrolling through individual newsletter issues.
Also, before I begin, this post is not intended to discourage the use of these traditional plots. They will always have their place.
Instead, it is to highlight specific situations where they can be replaced with better plotting ideas.
#1) Size-encoded heatmaps
A traditional heatmap represents the values using a color scale. Yet, mapping the cell color to exact numbers is still challenging.
Embedding a size component into the heatmap helps here. In essence, the bigger the size of a cell's marker, the higher the absolute value.
This is especially useful to make heatmaps cleaner, as many values nearer to zero will immediately shrink.
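A minimal sketch of the idea with matplotlib, assuming a square correlation matrix computed from synthetic data. The helper name `size_encoded_heatmap`, the `max_marker_area` parameter, and the feature labels are made up for illustration:

```python
import numpy as np
import matplotlib.pyplot as plt


def size_encoded_heatmap(matrix, labels, max_marker_area=900):
    """Draw a square matrix as markers: area ~ |value|, color ~ signed value."""
    matrix = np.asarray(matrix, dtype=float)
    n_rows, n_cols = matrix.shape
    # One (x, y) grid position per cell.
    xs, ys = np.meshgrid(np.arange(n_cols), np.arange(n_rows))
    # Marker area is proportional to the absolute value of the cell.
    largest = np.abs(matrix).max()
    sizes = np.abs(matrix) / largest * max_marker_area

    fig, ax = plt.subplots(figsize=(5, 4))
    points = ax.scatter(
        xs.ravel(), ys.ravel(), s=sizes.ravel(), c=matrix.ravel(),
        cmap="coolwarm", vmin=-largest, vmax=largest, marker="s",
    )
    ax.set_xticks(np.arange(n_cols))
    ax.set_xticklabels(labels, rotation=45, ha="right")
    ax.set_yticks(np.arange(n_rows))
    ax.set_yticklabels(labels)
    ax.invert_yaxis()  # first row on top, like a regular heatmap
    fig.colorbar(points, ax=ax, label="correlation")
    return fig, ax, sizes


# Synthetic data (an assumption for this sketch, not from the post).
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 5))
corr = np.corrcoef(data, rowvar=False)
labels = [f"f{i}" for i in range(5)]
fig, ax, sizes = size_encoded_heatmap(corr, labels)
```

Call `plt.show()` or `fig.savefig(...)` to view the result. Near-zero correlations shrink to tiny squares, while the diagonal fills its cells.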
#2) Waterfall charts
To visualize the change in value over time, a line (or bar) plot may not always be an apt choice.
This is because a line plot (or bar plot) depicts the actual values in the chart. Thus, it is difficult to visually estimate the scale and direction of incremental changes.
Instead, try a waterfall chart, which elegantly shows these rolling differences.
Here, the start and final values are represented by the first and last bars.
Also, the consecutive changes are automatically color-coded, making them easier to interpret.
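One way to sketch a waterfall chart with plain matplotlib is to draw floating bars whose bottoms sit at the previous value. The `waterfall` helper and the monthly numbers below are hypothetical, and the color-coding is done by hand here rather than automatically:

```python
import numpy as np
import matplotlib.pyplot as plt


def waterfall(labels, values):
    """Plot the start value, each consecutive change, and the end value."""
    values = np.asarray(values, dtype=float)
    deltas = np.diff(values)  # change between consecutive periods
    # First and last bars are totals; the bars in between float at the
    # previous value and extend up (or down) by the change.
    heights = np.concatenate(([values[0]], deltas, [values[-1]]))
    bottoms = np.concatenate(([0.0], values[:-1], [0.0]))
    colors = (
        ["tab:blue"]
        + ["tab:green" if d >= 0 else "tab:red" for d in deltas]
        + ["tab:blue"]
    )
    names = ["Start"] + list(labels[1:]) + ["End"]

    fig, ax = plt.subplots()
    ax.bar(names, heights, bottom=bottoms, color=colors)
    ax.set_ylabel("value")
    return fig, ax, heights, bottoms


# Hypothetical monthly values, for illustration only.
months = ["Jan", "Feb", "Mar", "Apr", "May"]
sales = [100, 120, 90, 130, 125]
fig, ax, heights, bottoms = waterfall(months, sales)
```

Each green or red bar ends exactly where the next one begins, so the size and direction of every change can be read directly.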
#3) Bump charts
When visualizing the change in rank over time of multiple categories, using a bar chart may not be appropriate.
This is because bar charts quickly become cluttered with many categories.
Instead, try Bump Charts. They are specifically used to visualize the rank of different items over time.
Comparing the bar chart and bump chart above, it is far easier to interpret the change in rank with a bump chart than with a bar chart.
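As a rough sketch, a bump chart can be built by ranking the items within each period and drawing one line per item on an inverted rank axis. The `bump_chart` helper and the scores are invented for illustration, and ties are assumed not to occur:

```python
import numpy as np
import matplotlib.pyplot as plt


def bump_chart(periods, scores):
    """scores maps item name -> one value per period; rank 1 is the highest."""
    names = list(scores)
    table = np.array([scores[n] for n in names], dtype=float)  # (items, periods)
    # Double argsort turns values into ranks within each period (column).
    ranks = (-table).argsort(axis=0).argsort(axis=0) + 1

    fig, ax = plt.subplots()
    x = np.arange(len(periods))
    for name, row in zip(names, ranks):
        ax.plot(x, row, marker="o", linewidth=2)
        ax.text(x[-1] + 0.1, row[-1], name, va="center")  # label at line end
    ax.set_xticks(x)
    ax.set_xticklabels(periods)
    ax.set_yticks(np.arange(1, len(names) + 1))
    ax.invert_yaxis()  # rank 1 at the top
    ax.set_ylabel("rank")
    return fig, ax, ranks


# Hypothetical scores for three items over three years.
scores = {"A": [10, 12, 9], "B": [8, 15, 14], "C": [12, 7, 11]}
fig, ax, ranks = bump_chart(["2021", "2022", "2023"], scores)
```

Only the rank is plotted, not the underlying value, which is what keeps the chart readable as the number of categories grows.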
#4) Raincloud Plots
Visualizing data distributions using box plots and histograms can be misleading at times.
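This is because entirely different datasets can produce the same box plot, and the shape of a histogram changes with the number of bins.
Thus, to avoid misleading conclusions, it is always recommended to plot the data distribution as precisely as possible.
Raincloud plots do this by combining three plots: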
Box plots for data statistics.
Strip plots for data overview.
KDE plots for the probability distribution of data.
With Raincloud plots, you can:
Combine multiple plots to prevent incorrect/misleading conclusions
Reduce clutter and enhance clarity
Improve comparisons between groups
Capture different aspects of the data through a single plot
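The three layers can be approximated in matplotlib alone, as sketched below. This is an assumption-laden version rather than a dedicated raincloud library: the `raincloud` helper is hypothetical, the half-violin is made by clipping a regular violin's vertices, and the two groups are synthetic:

```python
import numpy as np
import matplotlib.pyplot as plt


def raincloud(groups, ax=None, seed=0):
    """groups maps name -> 1-D values. Draws KDE cloud, box plot, and strip."""
    rng = np.random.default_rng(seed)
    if ax is None:
        _, ax = plt.subplots()
    for pos, values in enumerate(groups.values()):
        values = np.asarray(values, dtype=float)
        # Cloud: a violin (KDE) clipped so only its right half remains.
        parts = ax.violinplot(values, positions=[pos], widths=0.8,
                              showextrema=False)
        for body in parts["bodies"]:
            verts = body.get_paths()[0].vertices
            verts[:, 0] = np.clip(verts[:, 0], pos, None)
        # Box plot: summary statistics, drawn narrow at the group position.
        ax.boxplot(values, positions=[pos], widths=0.08,
                   showfliers=False, manage_ticks=False)
        # Rain: the raw points, jittered to the left of the box.
        jitter = rng.uniform(-0.35, -0.10, size=values.size)
        ax.scatter(pos + jitter, values, s=6, alpha=0.5)
    ax.set_xticks(range(len(groups)))
    ax.set_xticklabels(list(groups))
    return ax


# Synthetic groups: one unimodal, one bimodal (illustrative only).
rng = np.random.default_rng(1)
groups = {
    "unimodal": rng.normal(0, 1, 300),
    "bimodal": np.concatenate([rng.normal(-2, 0.5, 150),
                               rng.normal(2, 0.5, 150)]),
}
ax = raincloud(groups)
```

The bimodal group is the interesting case: its box plot alone would hide the two peaks that the KDE and the raw points make obvious.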
#5-6) Hexbin and Density Plots
Scatter plots can get too dense to interpret when you have thousands of data points.
Instead, you can replace them with Hexbin plots.
Hexbin plots bin the area of a chart into hexagonal regions. Each region is assigned a color intensity based on the method of aggregation used (the number of points, for instance).
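A short sketch using matplotlib's `hexbin`, which counts points per hexagon by default. The data is synthetic, and `gridsize=40` is an arbitrary choice for this example:

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic, correlated points (illustrative only).
rng = np.random.default_rng(42)
x = rng.normal(size=20_000)
y = 0.6 * x + rng.normal(scale=0.8, size=20_000)

fig, ax = plt.subplots()
# Default aggregation is the number of points per hexagon;
# mincnt=1 leaves hexagons without any points blank.
hb = ax.hexbin(x, y, gridsize=40, cmap="viridis", mincnt=1)
fig.colorbar(hb, ax=ax, label="points per hexagon")

counts = hb.get_array()  # one count per drawn hexagon
```

To aggregate something other than counts, `hexbin` also accepts a third array through `C` together with `reduce_C_function` (for example `np.mean`).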
Another choice is a density plot, which illustrates the distribution of points in a two-dimensional space.
A contour is created by connecting points of equal density. In other words, a single contour line depicts an equal density of data points.
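One possible way to draw such contours is to estimate the density with a Gaussian KDE from SciPy, evaluate it on a grid, and pass the grid to matplotlib's `contour`. The data, the 80x80 grid, and the number of levels are assumptions made for this sketch:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde

# Synthetic, correlated points (illustrative only).
rng = np.random.default_rng(7)
x = rng.normal(size=2_000)
y = 0.5 * x + rng.normal(scale=0.7, size=2_000)

# Estimate the 2-D density from the points.
kde = gaussian_kde(np.vstack([x, y]))

# Evaluate the estimate on a regular grid covering the data range.
gx, gy = np.meshgrid(
    np.linspace(x.min(), x.max(), 80),
    np.linspace(y.min(), y.max(), 80),
)
density = kde(np.vstack([gx.ravel(), gy.ravel()])).reshape(gx.shape)

fig, ax = plt.subplots()
# Each contour line joins grid locations with the same estimated density.
contours = ax.contour(gx, gy, density, levels=8, cmap="viridis")
ax.set_xlabel("x")
ax.set_ylabel("y")
```

Swapping `contour` for `contourf` fills the regions between the lines, which can read better on dense data.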
#7-8) Bubble charts and Dot plots
As discussed above, bar plots quickly get messy and cluttered as the number of categories increases.
Instead, try bubble charts and dot plots. They are like scatter plots but:
with one categorical axis
and one continuous axis
As depicted above:
It is difficult to interpret the bar plot because it has too many bars packed into a small space,
But size-encoded bubbles make it pretty easy to visualize the change over time.
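A bubble chart of this kind can be sketched as a scatter plot with categories on one axis, years on the other, and the value mapped to marker area. The category names, years, and values are all hypothetical:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data: one value per (category, year) pair.
categories = [f"Category {i}" for i in range(1, 7)]
years = np.arange(2015, 2023)
rng = np.random.default_rng(1)
values = rng.integers(10, 100, size=(len(categories), years.size))

# Grid of positions: rows are categories, columns are years.
xs, ys = np.meshgrid(years, np.arange(len(categories)))
# Marker area is proportional to the value.
sizes = values / values.max() * 600

fig, ax = plt.subplots(figsize=(8, 4))
ax.scatter(xs.ravel(), ys.ravel(), s=sizes.ravel(), alpha=0.6)
ax.set_yticks(np.arange(len(categories)))
ax.set_yticklabels(categories)
ax.set_xlabel("year")
```

The same 48 values drawn as grouped bars would need 48 bars; here each one is a single bubble on a shared grid.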
Both dot plots and bubble charts are based on the idea that, at times, when we have a bar plot with many bars, we’re often not paying attention to the individual bar lengths.
Instead, we mostly consider the individual endpoints that denote the total value.
These plots help us depict precisely that, while also eliminating the long bars, which are of little to no use.
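For comparison, a dot plot keeps only the endpoints. Below is a minimal sketch with made-up item names and totals, sorted so the ordering is easy to scan:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical totals for 20 items.
rng = np.random.default_rng(3)
names = [f"Item {i}" for i in range(1, 21)]
totals = rng.integers(50, 500, size=len(names))

# Sort so the largest total ends up at the top of the chart.
order = np.argsort(totals)
sorted_names = [names[i] for i in order]
sorted_totals = totals[order]

fig, ax = plt.subplots(figsize=(5, 6))
# One dot per item at its total value; no bar is drawn.
ax.plot(sorted_totals, np.arange(len(names)), "o")
ax.set_yticks(np.arange(len(names)))
ax.set_yticklabels(sorted_names)
ax.grid(axis="x", linestyle=":")  # light guides for reading values
ax.set_xlabel("total value")
```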
👉 Over to you: Are there any other lesser-known yet valuable plots that I haven’t covered here? If yes, when do you use them?
Thanks for reading!