Abstract
Many parallel applications do not scale with the number of threads. Several online and offline strategies have been proposed to optimize this number. While online strategies can capture behaviors that are only known at runtime, offline strategies impose no execution overhead and can use more complex and efficient algorithms. However, the learning algorithm in offline strategies may take several hours, which precludes their use or hinders smooth portability across different systems. In this scenario, we propose a methodology that decreases the learning time of offline strategies by inferring the execution behavior of parallel applications from smaller input sets than the ones used by the target applications. It implements two search strategies: SEA, where all parallel regions of an application run with the same number of threads; and SPRA, which seeks an ideal number of threads for each parallel region of a given application. With an extensive set of experiments, we show that SEA and SPRA converge to results close to those of an offline approach applied over the regular input, while being on average 88% and 87% faster, respectively. We also show that SPRA is better than SEA for unbalanced applications.
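The difference between the two strategies can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the search procedure, so the sketch assumes an exhaustive search over a short list of candidate thread counts. The function names `sea` and `spra`, the `measure` callback (standing in for a timed run on the smaller input set), and the toy cost model with its region names are all hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical callback: time of one parallel region for a given thread
# count, as it would be measured on the SMALL input set.
Measure = Callable[[str, int], float]


def sea(regions: List[str], candidates: List[int], measure: Measure) -> int:
    """SEA-style search: one thread count shared by every parallel region.

    Assumption: exhaustive search minimizing the summed region times.
    """
    return min(candidates, key=lambda t: sum(measure(r, t) for r in regions))


def spra(regions: List[str], candidates: List[int], measure: Measure) -> Dict[str, int]:
    """SPRA-style search: an independent thread count for each region."""
    return {r: min(candidates, key=lambda t: measure(r, t)) for r in regions}


def toy_measure(region: str, threads: int) -> float:
    """Made-up cost model: parallel work shrinks with threads, overhead grows."""
    work, overhead = {"region_a": (64.0, 0.6), "region_b": (8.0, 2.0)}[region]
    return work / threads + overhead * threads


regions = ["region_a", "region_b"]
candidates = [1, 2, 4, 8, 16]

shared = sea(regions, candidates, toy_measure)
per_region = spra(regions, candidates, toy_measure)

sea_total = sum(toy_measure(r, shared) for r in regions)
spra_total = sum(toy_measure(r, per_region[r]) for r in regions)
print(shared, per_region, sea_total, spra_total)
```

In this toy model the two regions prefer different thread counts, so the per-region choice yields a lower total time than any single shared count. This mirrors the kind of case in which the abstract reports SPRA outperforming SEA.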