Speed Up Pandas Apply 5x with NumPy
While creating conditional columns in Pandas, we tend to use the apply() method almost all the time.
However, apply() in Pandas is nothing but a glorified for-loop. As a result, it misses the whole point of vectorization.
Instead, you should use the np.select() function to create conditional columns. It does the same job but is much faster, because it evaluates each condition on the whole column at once instead of row by row.
The list of conditions and the list of corresponding results are passed as the first two arguments (condlist and choicelist). The last argument, default, is the result used where none of the conditions is true.
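To make the comparison concrete, here is a minimal sketch of both approaches side by side. The DataFrame, the "score" column, the grade() helper, and the grade thresholds are made up for illustration and are not from the original post.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the column name and thresholds are illustrative only.
df = pd.DataFrame({"score": [95, 72, 48, 85, 60]})

# Approach 1: apply() calls a Python function once per row.
def grade(score):
    if score >= 90:
        return "A"
    elif score >= 70:
        return "B"
    return "C"

df["grade_apply"] = df["score"].apply(grade)

# Approach 2: np.select() evaluates each condition on the whole column.
# Conditions are checked in order; the first one that is true wins.
conditions = [df["score"] >= 90, df["score"] >= 70]
choices = ["A", "B"]
df["grade_select"] = np.select(conditions, choices, default="C")

print(df)
```

Both columns hold the same values. The actual speed-up depends on the size of the data and the number of conditions, so it is worth timing both versions on your own DataFrame.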
Read more here: NumPy docs.
Find the code for my tips here: GitHub.
I like to explore, experiment and write about data science concepts and tools. You can read my articles on Medium. Also, you can connect with me on LinkedIn.


