Abstract

Virtual machines face a storage bottleneck in practice, which shows up in two main ways: the I/O limit and dynamic virtual machine migration. GlusterFS is an open source, high-level distributed file storage system that uses a Scale-Out architecture and an elastic hash algorithm to address the I/O limit. GlusterFS can also replicate files automatically and provides a file sharing service, which addresses the dynamic migration bottleneck. To evaluate GlusterFS as the underlying storage in a cloud environment, the IOzone file system benchmark is used to test its performance. The results show that the storage performance of GlusterFS improves linearly as physical servers are added dynamically, that write speed stays stable when multiple clients write large files to GlusterFS at the same time, and that users can define their own number of data replicas. GlusterFS is therefore a good choice for solving the storage bottleneck of virtual machines in a cloud environment.

Introduction

As virtualization technology continues to mature, the performance of virtual machines has come close to that of physical hosts and can meet the requirements of many applications. Because virtual machines also have a great advantage in cost and maintenance, many companies and university laboratories deploy their daily applications on virtual machines to reduce the number of physical servers, so that enterprises now own more virtual servers than physical servers. This, however, brings a new problem of storage performance [1].

In virtual machine applications, storage performance problems have two main aspects. One is the I/O limit. A physical server usually runs multiple virtual machines, and each may host an application with high I/O demand. Because the IOPS (Input/Output Operations Per Second) of each physical disk is fixed, the virtual machines end up competing for I/O resources.
The other is dynamic migration of virtual machines. In operation and management, physical servers must be shut down regularly for maintenance, and occasionally a physical server fails suddenly. In both cases the virtual machines on that server must keep running without interruption, or with only a brief interruption, so that the services running on them are not affected [2].

The common commercial solution to these two problems is a centralized commercial disk storage cabinet, which provides unified storage nodes and relies on the high IOPS of commercial products to keep virtual machines running. However, commercial centralized storage raises the cost per virtual machine, and overly centralized storage leads to I/O storms and other issues. With the development of cloud computing, achieving high IOPS and high availability through software has become an alternative to commercial storage solutions. As typical SDS (Software-Defined Storage) products, open source distributed storage systems can easily solve both problems.

GlusterFS Overview

GlusterFS is a distributed file system released under the GPL license. The project was started in 2005 by the company Gluster. Red Hat acquired Gluster in 2011 and now leads the project, and GlusterFS has been adopted in Red Hat Storage Server.

International Conference on Advances in Mechanical Engineering and Industrial Informatics (AMEII 2015). © 2015. The authors. Published by Atlantis Press.

GlusterFS is an open source, high-level distributed file system whose core is a horizontally scalable (Scale-Out) storage solution. Storage capacity and performance grow simply by adding resources; disk, compute, and I/O resources can be increased independently; and 10GbE and InfiniBand high-speed networks are supported.
By scaling out storage, GlusterFS can support multiple petabytes of capacity and serve thousands of clients. It manages data under a single global namespace and provides good performance for a variety of data workloads. It is built on a stackable user-space design and aggregates physically distributed storage resources over TCP/IP or InfiniBand RDMA networks. GlusterFS frees users from standalone, high-cost, closed storage systems: ordinary inexpensive storage devices can be used to deploy a centrally managed, scalable, virtualized storage pool whose capacity can grow to the TB or PB level. GlusterFS also offers strong scalability and reliability [3].

Compared with traditional NAS (Network Attached Storage), SAN (Storage Area Network), and RAID (Redundant Arrays of Inexpensive Disks), GlusterFS has the following advantages. Capacity can be expanded without reducing performance. It is inexpensive and simple to use, and is layered as an abstraction entirely on top of existing file systems. Its expansion and fault-tolerance design is more reasonable and less complex: it is extended through translators, scheduling is extended through the scheduling interface, and fault tolerance is handed to the local file system [4]. It is adaptable and easy to deploy, depends little on its environment, and is convenient to use, debug, and maintain. It supports mainstream Linux distributions, including Fedora (fc), Ubuntu, Debian, and SUSE, and has a number of successful deployments.

GlusterFS offers ways to resolve both storage performance bottlenecks. First, for the I/O limit, GlusterFS uses a Scale-Out architecture and an elastic hash algorithm. It scales out linearly: storage nodes can be added to the cluster dynamically, without stopping the service, to raise the I/O performance of the storage pool.
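The placement idea behind an elastic hash can be illustrated with a minimal sketch. It is not the GlusterFS implementation: the hash function (CRC32 as a stand-in), the even split of a 32-bit hash space, and the node and image names are all assumptions made for illustration. The point is only that a file's location can be computed from its name, and that adding nodes spreads the same files over more servers.

```python
import zlib

HASH_SPACE = 2 ** 32  # assumed 32-bit hash space


def build_layout(bricks):
    """Split the hash space into one contiguous range per storage node."""
    step = HASH_SPACE // len(bricks)
    layout = []
    for i, brick in enumerate(bricks):
        start = i * step
        # The last node absorbs the remainder so the ranges cover the whole space.
        end = HASH_SPACE if i == len(bricks) - 1 else start + step
        layout.append((start, end, brick))
    return layout


def file_hash(name):
    """Map a file name to a 32-bit value (CRC32 is a stand-in hash)."""
    return zlib.crc32(name.encode("utf-8"))


def locate(name, layout):
    """Return the node whose hash range contains the hash of the file name."""
    h = file_hash(name)
    for start, end, brick in layout:
        if start <= h < end:
            return brick
    raise ValueError("layout does not cover the hash space")


if __name__ == "__main__":
    # Hypothetical virtual machine image names.
    images = [f"vm-{i:03d}.img" for i in range(12)]
    for bricks in (["node1", "node2"], ["node1", "node2", "node3", "node4"]):
        layout = build_layout(bricks)
        counts = {brick: 0 for brick in bricks}
        for image in images:
            counts[locate(image, layout)] += 1
        print(len(bricks), "nodes:", counts)
```

Running the script prints how many of the hypothetical images land on each node for a two-node and a four-node pool. In this simplified version every range shifts when a node is added, so it says nothing about how much data a real system would have to move.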
Virtual machine I/O is balanced automatically across the storage pool by the elastic hash algorithm [5], which places virtual machines on different storage nodes in the pool and so avoids contention among virtual machines for I/O resources.

Second, for virtual machine live migration, GlusterFS can replicate files automatically so that data remains accessible even after a failure (including hardware, disk, and network failures, and data corruption caused by administrator error). Its self-healing capability restores data to the correct state; repair runs incrementally in the background and adds almost no performance load. This ensures that virtual machines keep running when equipment fails [6]. GlusterFS also provides a file-level sharing service, so virtual machine image files can be stored on GlusterFS. When a physical compute node fails, the virtual machine image files are unaffected, and the virtual machine can be restored after only a short stop. During routine maintenance of a physical compute node, virtual machines can be migrated directly to another physical compute node without stopping, which achieves dynamic virtual machine migration.

GlusterFS Performance Test Environment

To evaluate GlusterFS as the underlying storage in a cloud environment, the IOzone file system I/O benchmark tool is used to test GlusterFS. The aim is to assess whether GlusterFS is suitable as a shared storage platform for virtual machine images and to provide a reference for using GlusterFS in a cloud environment. Each physical server is configured with 16 × Intel(R) Xeon(R) CPU E5-2650 @ 2.00GHz, 128 GB RAM, Ubuntu 12.04 LTS (AMD64 server) as the operating system, 12 × 3 TB + 2 × 300 GB SATA II disks, GlusterFS version 3.2.5, and a Dell PERC 7/E storage controller.
Each client is configured with an Intel(R) Xeon(R) CPU E5-2650 @ 2.00GHz, 8 GB RAM, Ubuntu 12.04 LTS as the operating system, and IOzone version .397 (compiled for 64-bit mode).
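The text does not state which IOzone options were used, so the following is only a sketch of how a client-side run against a mounted GlusterFS volume might be assembled. The mount point, file size, and record size are hypothetical values; the flags themselves (`-i`, `-s`, `-r`, `-f`) are standard IOzone options.

```python
import shlex


def iozone_command(test_file, file_size="4g", record_size="1m", tests=(0, 1)):
    """Build an IOzone command line for one write/read run.

    -i selects a test (0 = write/rewrite, 1 = read/re-read),
    -s sets the file size, -r the record size, -f the test file path.
    """
    cmd = ["iozone"]
    for test in tests:
        cmd += ["-i", str(test)]
    cmd += ["-s", file_size, "-r", record_size, "-f", test_file]
    return cmd


if __name__ == "__main__":
    # /mnt/glusterfs is a hypothetical mount point of the GlusterFS volume.
    cmd = iozone_command("/mnt/glusterfs/iozone.tmp")
    print(shlex.join(cmd))
    # To run the benchmark on a client with IOzone installed:
    #   import subprocess
    #   subprocess.run(cmd, check=True)
```

Building the command as a list keeps it easy to vary the file size or record size between runs and to start the same command from several clients at once.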
