Abstract

Data pre-deployment in the HDFS (Hadoop distributed file systems) is more complicated than that in traditional file systems. There are many key issues need to be addressed, such as determining the target location of the data prefetching, the amount of data to be prefetched, the balance between data prefetching services and normal data accesses. Aiming to solve these problems, we employ the characteristics of digital ocean information service flows and propose a deployment scheme which combines input data prefetching with output data oriented storage strategies. The method achieves the parallelism of data preparation and data processing, thereby massively reducing I/O time cost of digital ocean cloud computing platforms when processing multi-source information synergistic tasks. The experimental results show that the scheme has a higher degree of parallelism than traditional Hadoop mechanisms, shortens the waiting time of a running service node, and significantly reduces data access conflicts.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.