Abstract
Randomization as a data placement algorithm has many advantages, including simple computation, long-term load balancing, and low cost. Recent work has improved it so that it scales well when disks are added to or removed from large storage systems such as a SAN (Storage Area Network). However, it still has a shortcoming: it cannot ensure load balancing in the short term when a few very hot data blocks are accessed frequently, a situation that is common in Web environments. To solve this problem, this paper presents an algorithm for selecting hot-spot data blocks and a data placement scheme built on it, both based on randomization. The difference from plain randomization is that the scheme redistributes a few very hot data blocks so that the load stays balanced over any short period. The method only needs to maintain access-frequency status for a few blocks, and it is easy to implement and inexpensive. A simulation model is implemented to compare the new data placement method with one that uses randomization alone. A real Web log is used to simulate the load, and the results show that the new method balances the disks' load better and improves performance by up to 100 percent. The new data placement algorithm will be more efficient in the storage system of a busy Web server.
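The idea of combining randomized placement with redistribution of a few hot blocks can be illustrated with a minimal sketch. The abstract does not specify the hot-spot selection rule or the redistribution policy, so everything here is an assumption made for illustration: hot blocks are taken to be the `num_hot` most frequently accessed ones, and each is moved to whichever disk currently carries the least load. The names `random_disk` and `place_blocks` are hypothetical and do not come from the paper.

```python
import hashlib
from collections import Counter


def random_disk(block_id: str, num_disks: int) -> int:
    """Pseudo-random placement: hash the block id onto a disk index."""
    digest = hashlib.sha256(block_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_disks


def place_blocks(accesses, num_disks, num_hot):
    """Return (placement, per-disk load) for a sequence of accessed block ids.

    Assumed policy, not the paper's: the num_hot most accessed blocks are
    treated as hot and placed greedily; all other blocks stay randomized.
    """
    counts = Counter(accesses)
    hot = [block for block, _ in counts.most_common(num_hot)]
    hot_set = set(hot)

    placement = {}
    load = [0] * num_disks

    # Cold blocks stay where randomization puts them.
    for block, n in counts.items():
        if block not in hot_set:
            disk = random_disk(block, num_disks)
            placement[block] = disk
            load[disk] += n

    # Hot blocks go, hottest first, to the currently least-loaded disk.
    for block in hot:
        disk = min(range(num_disks), key=load.__getitem__)
        placement[block] = disk
        load[disk] += counts[block]

    return placement, load
```

Calling `place_blocks(accesses, num_disks, 0)` gives the purely randomized baseline, so the two per-disk load lists can be compared directly. Only the access counts of the hot candidates need to be tracked in a real system; the sketch counts every block purely for brevity.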