Abstract

Distributed column storage is one technique for improving the efficiency of big data access in cloud computing environments. To reduce the frequency of network data access, this paper establishes a data localization strategy and designs a multi-threaded algorithm. First, the data table is segmented horizontally, and each segment is then divided vertically into data columns, with the column data belonging to the same horizontal segment placed on the same node in the cluster. Second, the data localization algorithm is designed and implemented on the Hadoop distributed cloud computing framework. Finally, experiments show that the data localization algorithm markedly reduces network access and improves data access efficiency.
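
The placement idea described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm and does not use any Hadoop API: the function names, the round-robin choice of node, and the in-memory dictionary standing in for cluster nodes are all assumptions made for illustration. It shows only the two splitting steps (horizontal, then vertical) and the rule that all column chunks of one horizontal segment land on the same node. The multi-threaded part is not shown.

```python
from collections import defaultdict


def horizontal_segments(rows, segment_size):
    """Split the table into consecutive row groups (horizontal segmentation)."""
    return [rows[i:i + segment_size] for i in range(0, len(rows), segment_size)]


def to_columns(segment, column_names):
    """Split one row group vertically into one list of values per column."""
    return {
        name: [row[idx] for row in segment]
        for idx, name in enumerate(column_names)
    }


def place_segments(rows, column_names, segment_size, nodes):
    """Assign every column chunk of a row group to the same node.

    Round-robin node selection is an assumption of this sketch,
    not something stated in the abstract.
    """
    placement = defaultdict(dict)
    for seg_id, segment in enumerate(horizontal_segments(rows, segment_size)):
        node = nodes[seg_id % len(nodes)]
        for name, values in to_columns(segment, column_names).items():
            placement[node][(seg_id, name)] = values
    return dict(placement)


# Hypothetical example table with three columns and four rows.
rows = [(1, "a", 10.0), (2, "b", 20.0), (3, "c", 30.0), (4, "d", 40.0)]
placement = place_segments(rows, ["id", "name", "score"], 2, ["node-0", "node-1"])
```

With this layout, reading several columns of the same rows touches a single node, which is the kind of access the abstract says should avoid network traffic.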
