A huge number of devices are now connected to the internet and rely on cloud environments to store data efficiently, including the big data generated by business activities. Ideally, cloud storage avoids duplicate copies of files and thereby reduces complexity. However, because many users store the same files on the platform, data redundancy arises, and fingerprint indexing becomes a major bottleneck in cloud data storage. To overcome these issues, a novel Enhanced Arithmetic Optimization Algorithm with Mixed-Kernel-based Extreme Learning Machine-based Random Forest (EAOA-MKELM-RF) method is proposed for deduplication. The method also addresses the scalability issues and the limitations of fingerprint indexing. In particular, the computation of hash function values is pipelined and parallelized to address the scalability issue in the deduplication process. The data storage container organizes programs and files into data blocks, metadata files and nonduplicate blocks. The proposed method also handles similarity issues and accelerates fingerprint queries for better scalability. The proposed approach is evaluated on the CorrAL-100, Madelon, XOR-100, Infocom’05 and Linux kernel datasets. The experimental results demonstrate that the proposed method attains a superior accuracy of 96.8%, evaluated in terms of restoration accuracy, robustness and delivery ratio. Additionally, the proposed approach reduces buffer time, improving overall system performance. These results indicate that the approach optimizes storage resources, reduces storage costs and improves overall data management.
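To make the notions of fingerprint indexing and a deduplicating storage container concrete, the following is a minimal illustrative sketch and not the EAOA-MKELM-RF method itself. It assumes fixed-size chunking, SHA-256 fingerprints and an in-memory index; the names `DedupContainer`, `fingerprint` and `CHUNK_SIZE` are hypothetical and do not come from the proposed system. Fingerprints are computed in parallel with a thread pool, loosely mirroring the parallelized hashing described above.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 4096  # assumed fixed chunk size; the source does not specify one


def fingerprint(chunk: bytes) -> str:
    """Return the SHA-256 hex digest used here as the chunk fingerprint."""
    return hashlib.sha256(chunk).hexdigest()


class DedupContainer:
    """Hypothetical container holding nonduplicate blocks and file metadata."""

    def __init__(self):
        self.blocks = {}    # fingerprint index: fingerprint -> unique block
        self.metadata = {}  # file name -> ordered list of fingerprints

    def store(self, name: str, data: bytes) -> int:
        """Store a file and return how many new (nonduplicate) blocks it added."""
        chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
        # Hash the chunks in parallel; map() preserves chunk order.
        with ThreadPoolExecutor() as pool:
            prints = list(pool.map(fingerprint, chunks))
        new_blocks = 0
        for fp, chunk in zip(prints, chunks):
            if fp not in self.blocks:  # fingerprint query against the index
                self.blocks[fp] = chunk
                new_blocks += 1
        self.metadata[name] = prints
        return new_blocks

    def restore(self, name: str) -> bytes:
        """Rebuild a file from its metadata and the stored unique blocks."""
        return b"".join(self.blocks[fp] for fp in self.metadata[name])


# Usage: two files that share one identical 4 KiB chunk.
container = DedupContainer()
file_a = b"A" * 8192 + b"B" * 4096
file_b = b"A" * 4096 + b"C" * 4096
container.store("a.bin", file_a)
container.store("b.bin", file_b)
```

In this sketch the five chunks of the two files reduce to three stored blocks, and `restore` reproduces each file exactly from its metadata. An in-memory dictionary is the simplest possible index; at scale, it is this fingerprint lookup that becomes the bottleneck the abstract refers to.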