Configure Sklearn To Output Pandas DataFrame
Scikit-learn recently announced one of its most awaited improvements. In short, sklearn can now be configured to output Pandas DataFrames instead of NumPy arrays.
Until now, sklearn's transformers accepted a Pandas DataFrame as input but always returned a NumPy array as output. As a result, the output had to be manually converted back to a Pandas DataFrame, and the column names had to be reattached by hand.
Now, the set_output API will let transformers output a Pandas DataFrame instead.
This will make running pipelines on DataFrames smoother. Moreover, it will provide better ways to track feature names.
P.S. The feature is still in development and will be rolled out soon!
Read more here: Release page.
I like to explore, experiment, and write about data science concepts and tools. You can connect with me on LinkedIn.