Daily Dose of Data Science

What Makes The Join() Method Blazingly Faster Than Iteration?

A reminder to always prefer specific methods over a generalized approach.

Avi Chawla
Jun 18, 2023

There are two popular ways to concatenate multiple strings:

  1. Iterating and appending them to a single string.

  2. Using Python’s built-in str.join() method.

The second approach is significantly faster than the first.
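For readers who want to time this on their own machine, here is a minimal sketch of the two approaches. The function names, the `words` list, and the repetition count are illustrative assumptions, not the code from the original post, and absolute timings will vary with the machine and the Python version.

```python
import timeit

# Hypothetical input: a list of short strings to concatenate.
words = ["data"] * 10_000


def concat_loop(strings):
    """Approach 1: iterate and append each string plus a separator."""
    result = ""
    for s in strings:
        result += s    # append the current string
        result += " "  # append the white space separator
    return result[:-1]  # drop the trailing separator


def concat_join(strings):
    """Approach 2: let str.join() build the result in one go."""
    return " ".join(strings)


# Time both approaches over the same input (number=20 is an arbitrary choice).
loop_time = timeit.timeit(lambda: concat_loop(words), number=20)
join_time = timeit.timeit(lambda: concat_join(words), number=20)
print(f"loop: {loop_time:.4f}s  join: {join_time:.4f}s")
```

Both functions return the same string, so any difference in the printed times comes from how the result is built.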

Here’s why (or maybe stop reading here and try to guess before you read ahead).


When iterating, Python naively executes the instructions it comes across.

Thus, it does not know (beforehand):

  • number of strings it will concatenate

  • number of white spaces it will need

In other words, iteration leaves little room for optimization.

As a result, at every iteration, Python requests memory for:

  • the string at the current iteration

  • the white space added as a separator

[Figure: Memory allocation at each iteration]

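This leads to repeated memory allocation requests. To be precise, the number of requests in this case is two times the size of the list: one for each string and one for each separator. Because Python strings are immutable, a concatenation may also have to copy everything accumulated so far into a new string.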
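The 2x figure can be illustrated with a small counting sketch. It follows the post's simplified model, in which every append counts as one allocation request. The function name and the counter are hypothetical, and a real interpreter may optimize some of these appends, so treat the count as a model rather than a measurement.

```python
def concat_and_count(strings, sep=" "):
    """Concatenate by iteration and count the appends performed.

    Assumption: each append stands for one allocation request,
    as in the simplified model described in the text.
    """
    result = ""
    appends = 0
    for s in strings:
        result += s    # request space for the current string
        appends += 1
        result += sep  # request space for the separator
        appends += 1
    return result, appends


text, appends = concat_and_count(["a", "b", "c"])
print(appends)  # 6, i.e. two appends per list element
```

Note that this loop also leaves a trailing separator on `text`, which a real implementation would need to strip.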

However, with join(), Python precisely knows (beforehand):

  • number of strings it will be concatenating

  • number of white spaces it will need

[Figure: Memory allocation at once]

Python can therefore compute the total size of the result upfront, request all of that memory in a single call, and then copy the strings and separators into place.
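The idea of "measure first, allocate once, then copy" can be sketched in pure Python. This is a conceptual model only, not the interpreter's actual implementation: the name `join_sketch` is hypothetical, and working on UTF-8 bytes in a `bytearray` is an assumption made to keep the example simple.

```python
def join_sketch(strings, sep=" "):
    """Two-pass join: compute the total size, allocate once, then copy."""
    chunks = [s.encode("utf-8") for s in strings]
    sep_bytes = sep.encode("utf-8")
    if not chunks:
        return ""

    # Pass 1: the number of strings and separators is known upfront,
    # so the final size can be computed before any copying starts.
    total = sum(len(c) for c in chunks) + len(sep_bytes) * (len(chunks) - 1)

    # Single allocation for the whole result.
    buffer = bytearray(total)

    # Pass 2: copy each piece into its slot; the buffer never grows.
    pos = 0
    for i, chunk in enumerate(chunks):
        if i > 0:
            buffer[pos:pos + len(sep_bytes)] = sep_bytes
            pos += len(sep_bytes)
        buffer[pos:pos + len(chunk)] = chunk
        pos += len(chunk)

    return buffer.decode("utf-8")


print(join_sketch(["Daily", "Dose"]))  # Daily Dose
```

The sketch is much slower than the built-in method because it runs in Python; its only purpose is to show why knowing the sizes beforehand allows a single allocation.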

To summarize:

  • with iteration, the number of memory allocation calls is 2x the list's size.

  • with join(), the number of memory allocation calls is just one.

This explains the significant difference in their run-time.

This post is also a reminder to always prefer specific methods over a generalized approach. What do you think?

👉 Over to you: What other ways do you commonly use to optimize native Python code?
