Supercharge value_counts() Method in Pandas With Sidetable
The value_counts() method is commonly used to analyze categorical columns, but it has many limitations.
For instance, viewing the count, percentage, and cumulative totals of each category in one place takes extra code and quickly gets tedious.
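To make the tedium concrete, here is a minimal sketch of building that combined summary by hand with plain pandas. The DataFrame and its `plan` column are hypothetical sample data, not taken from the original post.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({"plan": ["basic"] * 5 + ["pro"] * 3 + ["team"] * 2})

# value_counts() returns the counts sorted in descending order.
counts = df["plan"].value_counts()

# Every extra statistic has to be derived and assembled manually.
manual_summary = pd.DataFrame({
    "count": counts,
    "percent": counts / counts.sum() * 100,
    "cumulative_count": counts.cumsum(),
    "cumulative_percent": counts.cumsum() / counts.sum() * 100,
})
print(manual_summary)
```

This works, but the same few lines have to be repeated for every column you want to inspect.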
Instead, use sidetable. Consider it a supercharged version of value_counts(). Its freq() method provides a more useful summary than value_counts(): counts, percentages, and their cumulative totals in a single table.
sidetable can also aggregate over multiple columns. You can supply a threshold so that the values beyond it are merged into a single bucket. It can also report missing-data statistics, pretty-print values, and more.
Read more: GitHub.