Abstract

The growth of information and networking technologies such as mobile devices, social media, and cloud applications requires efficient systems for managing big data. Storage is especially important because it weighs heavily on the cost of big data analysis: the storage system must provide techniques that can search, access, process, retrieve, and distribute data items quickly. Indexing and distribution are therefore the most important techniques for dealing with big data storage in the cloud. The continuous increase in big data requires improved indexing and distribution systems, which motivates new optimized indexing schemes. In general, indexing techniques for big data are grouped into three categories: Artificial Intelligent indexing (AI), Non-Artificial Intelligent indexing (NAI), and Collaborative Artificial Intelligent (CAI) indexing. The goal of this paper is to give readers a systematic understanding of these emerging techniques. It provides a thorough survey of storage in big data systems with respect to its two most important aspects, indexing and distribution, and gives in-depth coverage of existing hashing techniques, clarifying their main characteristics and how hash indexing can be used. It also provides an overview of indexing in the Hadoop framework, especially in HDFS, for future development.
