The Most Common Misconception About Inplace Operations in Pandas
...and here's what happens in reality.
Pandas users often modify a DataFrame inplace expecting better performance. Yet, it may not always be efficient. Here's why.
The image compares the run-time of inplace and non-inplace operations. In most cases, inplace operations are slower.
Contrary to common belief, most inplace operations DO NOT prevent the creation of a new copy. It is just that inplace assigns the copy back to the same address.
But during this assignment, Pandas performs some extra checks (SettingWithCopy) to ensure that the DataFrame is being modified correctly. This, at times, can be an expensive operation.
So, in general, there is no guarantee that an inplace operation is faster.
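If you want to check this on your own data, here is a rough timing sketch. The helper names (`make_df`, `sort_copy`, `sort_inplace`, `time_sort`), the choice of `sort_values`, and the DataFrame size are all assumptions made for illustration, not the benchmark behind the image. Each run works on a fresh copy so the inplace version never starts from already-sorted data, and the numbers will vary with your pandas version and machine.

```python
import timeit

import numpy as np
import pandas as pd


def make_df(n=100_000):
    # Hypothetical test data: two columns of random floats.
    rng = np.random.default_rng(0)
    return pd.DataFrame({"a": rng.random(n), "b": rng.random(n)})


def sort_copy(df):
    # Non-inplace: returns a new, sorted DataFrame.
    return df.sort_values("a")


def sort_inplace(df):
    # Inplace: modifies df and returns nothing, so we return df ourselves.
    df.sort_values("a", inplace=True)
    return df


def time_sort(func, n=100_000, repeats=5):
    # Both variants pay for the same base.copy(), so the comparison stays fair.
    base = make_df(n)
    runs = timeit.repeat(lambda: func(base.copy()), number=1, repeat=repeats)
    return min(runs)


if __name__ == "__main__":
    print("non-inplace:", time_sort(sort_copy))
    print("inplace:    ", time_sort(sort_inplace))
```

Treat the output as a measurement of one operation on one setup, not as a general rule.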
What’s more, inplace operations do not allow chaining multiple operations, because a method called with inplace=True returns None instead of a DataFrame.
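The original example is not reproduced here, so the following is only a minimal sketch of what such a chain might look like. The column names and the particular methods (`dropna`, `sort_values`, `reset_index`) are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical data with one missing value in column "a".
df = pd.DataFrame({"a": [3, 1, None, 2], "b": ["x", "y", "z", "w"]})

# Without inplace, each method returns a new DataFrame,
# so the calls can be chained into a single expression.
result = (
    df.dropna(subset=["a"])
      .sort_values("a")
      .reset_index(drop=True)
)

# With inplace=True, the method modifies tmp and (as documented) returns None,
# so there is no DataFrame to call the next method on.
tmp = df.copy()
returned = tmp.dropna(subset=["a"], inplace=True)
print(returned)
```

To get the same result with inplace, each step would have to be a separate statement on `tmp`.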
Find the code for my tips here: GitHub.