Abstract

As the economy continues to develop, the volume of data held by banks keeps growing, making effective data management urgent and necessary. This paper studies Hadoop-based management of historical bank data from the perspective of data migration, and designs and implements a data migration module. MapReduce is used to migrate structured data, and an IO-load-based scheduling algorithm is designed: scheduling takes resource consumption into account so that tasks are not assigned to nodes already under heavy IO load. For unstructured data, a migration tool built on FTP concurrently transfers log files from online service platforms and other data into specified HDFS directories. The final system tests show that the system works correctly and meets the design requirements.
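The abstract's IO-load-based scheduling idea can be illustrated with a minimal sketch. This is not the paper's algorithm: the names (`Node`, `assign_tasks`), the 0.8 load threshold, and the fixed per-task IO cost are all assumptions for illustration. The sketch assigns each migration task to the least IO-loaded node and skips nodes whose load exceeds the threshold.

```python
from dataclasses import dataclass, field

IO_LOAD_THRESHOLD = 0.8  # assumed cutoff: skip nodes busier than this

@dataclass
class Node:
    name: str
    io_load: float              # assumed metric: fraction of IO capacity in use (0.0-1.0)
    tasks: list = field(default_factory=list)

def assign_tasks(tasks, nodes):
    """Assign each task to the least IO-loaded eligible node (illustrative only)."""
    for task in tasks:
        # heavy-loaded IO nodes are excluded, per the abstract's design goal
        eligible = [n for n in nodes if n.io_load < IO_LOAD_THRESHOLD]
        if not eligible:
            raise RuntimeError("all nodes are IO-saturated; defer the task")
        target = min(eligible, key=lambda n: n.io_load)
        target.tasks.append(task)
        # crude assumption: each assigned task adds a fixed IO cost to the node
        target.io_load += 0.1
    return nodes
```

For example, with nodes at loads 0.7, 0.2, and 0.9, both of two incoming tasks go to the 0.2 node, and the 0.9 node (above the threshold) receives nothing.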
