Abstract

Scientists, researchers, engineers: almost everyone who works with data crosses paths with Pandas at some point. It is a powerful library that allows easy, rapid, and efficient manipulation of data, and it can convert the data it represents into various file types. Given the abundance of today's data, determining which of these file types stores the same Pandas data in the smallest size on disk is an important issue. In this study, the file types that can save Pandas data with minimum size are experimentally investigated from various perspectives. The CSV, HDF, JSON, Excel, and Pickle file types are included in the experiments. The sizes of these files were benchmarked under several conditions, such as the completeness or lack of data and the types of variables contained in the data. In addition, how file sizes vary as the amount of data increases was also examined.
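The kind of comparison described above can be illustrated with a minimal sketch. It is not the study's benchmark code: the helper name `file_sizes`, the sample columns, and the row count are assumptions chosen for illustration. Only CSV, JSON, and Pickle are written here because HDF and Excel output need optional dependencies (PyTables and an Excel writer such as openpyxl); with those installed, `DataFrame.to_hdf` and `DataFrame.to_excel` could be added in the same way.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def file_sizes(df: pd.DataFrame) -> dict:
    """Write df in several formats and return each file's on-disk size in bytes.

    Hypothetical helper for illustration; not taken from the study.
    """
    writers = {
        "csv": lambda path: df.to_csv(path, index=False),
        "json": lambda path: df.to_json(path),
        "pickle": lambda path: df.to_pickle(path),
    }
    sizes = {}
    with tempfile.TemporaryDirectory() as tmp:
        for name, write in writers.items():
            path = os.path.join(tmp, f"data.{name}")
            write(path)
            sizes[name] = os.path.getsize(path)
    return sizes


# Assumed sample data: mixed variable types with some missing values.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "ints": rng.integers(0, 100, size=1_000),
    "floats": rng.random(1_000),
    "strings": rng.choice(["a", "b", "c"], size=1_000),
})
df.loc[df.index % 10 == 0, "floats"] = np.nan  # blank out every tenth value

sizes = file_sizes(df)
for name, size in sorted(sizes.items(), key=lambda kv: kv[1]):
    print(f"{name}: {size} bytes")
```

Repeating the call with different row counts, column types, or proportions of missing values would give a rough view of how each format's size responds to the conditions the abstract lists.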
