Generate Your Own Fake Data In Seconds
To run or test a pipeline, we usually need to feed it some dummy data.
Python's "random" library can generate random strings, floats, and integers. Because the output is random, though, it does not look like meaningful data such as people's names, city names, or emails.
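To make the limitation concrete, here is a minimal sketch that builds a few dummy rows with only the standard library. The field names ("name", "age", "score") and the value ranges are assumptions chosen for illustration.

```python
import random
import string

random.seed(0)  # fixed seed so the run is repeatable


def random_row():
    """Build one row of purely random values using only the standard library."""
    return {
        # 8 random lowercase letters: a valid string, but not a believable name
        "name": "".join(random.choices(string.ascii_lowercase, k=8)),
        "age": random.randint(18, 90),  # inclusive on both ends
        "score": round(random.uniform(0, 100), 2),
    }


rows = [random_row() for _ in range(3)]
for row in rows:
    print(row)
```

The integers and floats are usable as they are, but the "name" column is just a jumble of letters, which is the gap described above.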
Searching for open-source datasets instead can be time-consuming. Moreover, the dataset you find may not fit your requirements well.
The Faker module in Python is a good solution to this. Faker lets you quickly generate highly customized fake (yet meaningful) data, such as names, cities, and emails. What's more, you can generate data specific to a demographic by choosing a locale, such as a particular country and language.
Read more here: Documentation.