Abstract

The Parallel and Distributed Computing group of the Integrated Technological Research Complex (CITI) has been developing general-purpose components for processing the large volumes of data that characterize parallel computing problems. Using the cache-oblivious model, which makes no assumptions about a specific computer architecture, together with the divide-and-conquer principle, a matrix transposition algorithm is implemented to reduce the execution time of this algebraic operation. The algorithm keeps most of the data it is working on in the cache and reuses that data while it is there, which reduces cache misses and increases speed. The work includes conclusions and statistical tests from experiments on computers with different architectures, showing that the cache-oblivious algorithm is superior beyond a matrix order that depends on the characteristics of each PC.
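The abstract does not give the implementation, so the following is only a minimal sketch of how a divide-and-conquer, cache-oblivious transposition is commonly structured; it is not the authors' code. The function names `transpose` and `transpose_recursive` and the `base` cutoff are assumptions made for illustration. The idea is to halve the larger dimension of the current block until the block is small, at which point it is copied directly; on contiguous storage such small blocks tend to fit in cache regardless of the cache size. Python lists are not contiguous arrays, so this shows the recursion structure rather than the measured speedup.

```python
def transpose_recursive(src, dst, row0, row1, col0, col1, base):
    """Write the transpose of the block src[row0:row1][col0:col1] into dst."""
    rows = row1 - row0
    cols = col1 - col0
    if rows <= base and cols <= base:
        # Small block: copy element by element.
        for i in range(row0, row1):
            for j in range(col0, col1):
                dst[j][i] = src[i][j]
    elif rows >= cols:
        # Split the taller dimension in half and recurse on each part.
        mid = row0 + rows // 2
        transpose_recursive(src, dst, row0, mid, col0, col1, base)
        transpose_recursive(src, dst, mid, row1, col0, col1, base)
    else:
        # Split the wider dimension in half and recurse on each part.
        mid = col0 + cols // 2
        transpose_recursive(src, dst, row0, row1, col0, mid, base)
        transpose_recursive(src, dst, row0, row1, mid, col1, base)


def transpose(matrix, base=16):
    """Return a new list-of-lists holding the transpose of `matrix`.

    `base` is a hypothetical cutoff that only limits recursion overhead;
    the recursion itself does not depend on any cache parameter.
    """
    if base < 1:
        raise ValueError("base must be at least 1")
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if n_rows else 0
    result = [[0] * n_rows for _ in range(n_cols)]
    transpose_recursive(matrix, result, 0, n_rows, 0, n_cols, base)
    return result
```

Calling `transpose(m)` on a rectangular list of lists returns a new matrix and leaves `m` unchanged. An in-place variant for square matrices would need to swap mirrored blocks instead of copying.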
