Reduce Memory Usage Of A Pandas DataFrame By 90%
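By default, Pandas assigns the widest datatype to a numeric column when it infers the column's type. For instance, an integer-valued column gets the int64 datatype, irrespective of its range.
To reduce memory usage, represent each column with a smaller datatype that is still wide enough to span the range of values it holds. For instance, a column whose values all lie between 0 and 100 fits in int8, which takes one byte per value instead of the eight bytes int64 needs.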
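Here is a minimal sketch of the idea. The DataFrame and its "age" column are made up for illustration, and pd.to_numeric with downcast="integer" is just one way to pick a smaller datatype; you could also call astype with a datatype you choose yourself.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one million integers between 0 and 99,
# stored explicitly as int64 to mirror what Pandas infers by default.
df = pd.DataFrame({"age": np.arange(1_000_000, dtype="int64") % 100})

# Memory taken by the column's values alone (index excluded), in bytes.
before = df["age"].memory_usage(index=False, deep=True)

# Downcast to the smallest integer datatype that can hold every value.
df["age"] = pd.to_numeric(df["age"], downcast="integer")

after = df["age"].memory_usage(index=False, deep=True)

print(f"dtype: {df['age'].dtype}")
print(f"before: {before:,} bytes, after: {after:,} bytes")
```

The saving you get depends on the range of values in each column. A column with a wider range will need int16 or int32 rather than int8, so it is worth checking the minimum and maximum values before downcasting.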
Read this blog for more info. It details many techniques to optimize the memory usage of a Pandas DataFrame.