Abstract

Many applied and research problems require working with multidimensional arrays (tensors). In practice, such arrays are stored efficiently and compactly as so-called tensor trains. This paper presents a parallel implementation of the TT-cross algorithm, which computes a tensor-train decomposition of a multidimensional array, on a CUDA-architecture graphics processor. The main aspects and features of the parallelization and implementation are described. The parallel implementation was tested on a representative set of examples. It shows a significant reduction in computational time compared with an analogous sequential implementation of the algorithm, which indicates the effectiveness of the proposed parallelization approaches.
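
For orientation, the tensor-train format referred to in the abstract can be written as below. This is the standard textbook definition, not a formula taken from this paper, and the symbols (the array A, the cores G_k, the mode sizes n_k, and the ranks r_k) are notation assumed here for illustration only.

```latex
% Tensor-train (TT) representation of a d-dimensional array A
% with mode sizes n_1, ..., n_d (assumed notation).
% Each G_k(i_k) is an r_{k-1} x r_k matrix, with r_0 = r_d = 1,
% so the product below is a 1 x 1 matrix, i.e. a scalar.
\[
  A(i_1, i_2, \dots, i_d) \;=\; G_1(i_1)\, G_2(i_2) \cdots G_d(i_d),
  \qquad G_k(i_k) \in \mathbb{R}^{r_{k-1} \times r_k}.
\]
% Storage cost of the cores versus the full array,
% with n = \max_k n_k and r = \max_k r_k:
\[
  \sum_{k=1}^{d} r_{k-1}\, n_k\, r_k \;\le\; d\, n\, r^{2}
  \qquad \text{versus} \qquad \prod_{k=1}^{d} n_k \;\le\; n^{d}.
\]
```

When the ranks r_k stay small, the storage grows linearly rather than exponentially in the number of dimensions d, which is the sense in which the representation is compact.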
