Content Based Automated File Organization Using Machine Learning燗pproaches

Syed Ali Raza,Sagheer Abbas,Taher M Ghazal,Hussam Al Hamadi,Munir Ahmad,Muhammad Adnan Khan

doi:10.32604/cmc.2022.029400

Abstract

In the world of big data, it's quite a task to organize different files based on their similarities. Dealing with heterogeneous data and keeping a record of every single file stored in any folder is one of the biggest problems encountered by almost every computer user. Much of file management related tasks will be solved if the files on any operating system are somehow categorized according to their similarities. Then, the browsing process can be performed quickly and easily. This research aims to design a system to automatically organize files based on their similarities in terms of content. The proposed methodology is based on a novel strategy that employs the charactaristics of both supervised and unsupervised machine learning approaches for learning categories of digital files stored on any computer system. The results demonstrate that the proposed architecture can effectively and efficiently address the file organization challenges using real-world user files. The results suggest that the proposed system has great potential to automatically categorize almost all of the user files based on their content. The proposed system is completely automated and does not require any human effort in managing the files and the task of file organization become more efficient as the number of files grows.

Full Text