Why you should not dump DataFrames to a CSV
The CSV file format is widely used to save Pandas DataFrames. But are you aware of its limitations? To name a few,
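1. A CSV does not store datatype information. If you change the datatype of one or more columns (to category or datetime, for example), save the DataFrame to a CSV, and load it again, Pandas re-infers the types from the text and may not return the datatypes you saved.
2. Saving a DataFrame to a CSV is less optimized than other formats Pandas supports, such as Parquet and Pickle. CSV is plain text, so writing and reading it is typically slower, and the files are typically larger, than with these binary formats.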
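The first limitation is easy to check yourself. The snippet below is a minimal sketch, not a benchmark: the column names and values are made up for illustration, and it uses Pickle for the comparison only because Parquet needs an extra engine (such as pyarrow) installed.

```python
import os
import tempfile

import pandas as pd

# Illustrative DataFrame (hypothetical data) with non-default dtypes.
df = pd.DataFrame(
    {
        "city": pd.Series(["Paris", "Oslo", "Paris"], dtype="category"),
        "count": pd.Series([1, 2, 3], dtype="int8"),
        "when": pd.to_datetime(["2023-01-01", "2023-01-02", "2023-01-03"]),
    }
)

with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "data.csv")
    pkl_path = os.path.join(tmp, "data.pkl")

    # Save the same DataFrame in both formats.
    df.to_csv(csv_path, index=False)
    df.to_pickle(pkl_path)

    # Load both back with default settings.
    from_csv = pd.read_csv(csv_path)
    from_pickle = pd.read_pickle(pkl_path)

# Compare the dtypes column by column against the original.
csv_dtypes_preserved = list(from_csv.dtypes) == list(df.dtypes)
pickle_dtypes_preserved = list(from_pickle.dtypes) == list(df.dtypes)

print("Original dtypes:\n", df.dtypes, sep="")
print("After CSV round trip:\n", from_csv.dtypes, sep="")
print("CSV preserved dtypes:", csv_dtypes_preserved)
print("Pickle preserved dtypes:", pickle_dtypes_preserved)
```

With default settings, the CSV round trip loses the category, int8, and datetime types, while the Pickle round trip keeps them. You can recover some types from a CSV by passing `dtype` and `parse_dates` to `pd.read_csv`, but you then have to track that information yourself.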
Of course, if you need to view your data outside Python (in Excel, for instance), a CSV is a reasonable choice. If not, prefer one of the other file formats.
Further reading: Why I Stopped Dumping DataFrames to a CSV and Why You Should Too.