Abstract
With at least 60 processing cores, the Xeon-Phi coprocessor is a truly multicore architecture, which consists of an interconnection speed among cores of 240 GB/s, two levels of cache memory, a theoretical performance of 1.01 Tflops, and programming flexibility, all making the Xeon-Phi an excellent coprocessor for parallelizing applications that seek to reduce computational times. The objective of this work is to migrate a geophysical application designed to directly calculate the gravimetric tensor components and their derivatives and in this way research the performance of one and two Xeon-Phi coprocessors integrated on the same node and distributed in various nodes. This application allows the analysis of the design factors that drive good performance and compare the results against a conventional multicore CPU. This research shows an efficient strategy based on nested parallelism using OpenMP, a design that in its outer structure acts as a controller of interconnected Xeon-Phi coprocessors while its interior is used for parallelyzing the loops. MPI is subsequently used to reduce the information among the nodes of the cluster.
Highlights
The Xeon-Phi coprocessor is one of the new architectures designed for high performance computing (HPC)
This research shows an efficient strategy based on nested parallelism using OpenMP, a design that in its outer structure acts as a controller of interconnected Xeon-Phi coprocessors while its interior is used for parallelyzing the loops
The tensor results obtained from G are shown in Figure 18; these results offer an important understanding into the definition of the geometry of bodies in order to characterize the structures of economic interest, since the full tensor gravity (FTG) data were able to detect superficial bodies, bodies with measured and exact amplitudes in a range of −10 to 7 Eotvos
Summary
The Xeon-Phi coprocessor is one of the new architectures designed for high performance computing (HPC). The Xeon-Phi is a X86 multicore architecture with low power consumption and with a theoretical performance of one Teraflop, for which it utilizes 60 real processing cores, a 30 MB cache, and high band-width interconnection [1]. One of the current research activities is to analyse the features and benefits of each new technology that emerges across the field of supercomputers, which is based, today, on GPUs and coprocessors. The effort to migrate scientific applications to CUDA (for GPUs) or OpenCL (i.e., programming that requires programming low level kernels) is often much higher as compared to directivebased programming like OpenMP (for CPU or MIC) [2]. Prior experiments which test the functionality of the XeonPhi coprocessor show that migrating scientific software is relatively easy, thereby making the MICs a promising tool for HPC applications [3]
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.