Abstract

Most machine learning algorithms need to handle large datasets, which often leads to limitations on processing time and memory. Expectation-Maximization (EM) is one such algorithm; it is used to train one of the most widely used parametric statistical models, the Gaussian Mixture Model (GMM). All steps of the algorithm are potentially parallelizable, since they iterate over the entire dataset. In this work, we propose a parallel implementation of EM for training GMMs using CUDA cores. The experimental scenario consists of five datasets and four metrics. Results show a speedup of 12.7 compared to the sequential version. With coalesced access to CUDA global memory and the use of shared memory, we reached up to 99.4% achieved occupancy, regardless of the number of Gaussians considered.
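As a rough illustration of the kind of kernel the abstract alludes to, the sketch below shows an E-step responsibility computation for a diagonal-covariance GMM. It is not the paper's implementation: the kernel name, the structure-of-arrays data layout (so consecutive threads read consecutive global-memory addresses, i.e. coalesced access), the staging of mixture parameters in shared memory, and the fixed D/K sizes are all illustrative assumptions.

```cuda
// Hypothetical E-step kernel: one thread per data point. Data is stored in
// structure-of-arrays layout, x[d * N + i], so threads in a warp access
// consecutive addresses (coalesced reads). The K components' weights, means,
// and diagonal variances are small, so they are staged in shared memory.
#include <math.h>

#define D 4   // feature dimension (illustrative)
#define K 8   // number of mixture components (illustrative)

__global__ void estep_responsibilities(const float *x,       // [D * N], SoA layout
                                        const float *weight,  // [K]
                                        const float *mean,    // [K * D]
                                        const float *var,     // [K * D] diagonal covariances
                                        float *resp,          // [K * N] responsibilities out
                                        int N)
{
    // Stage the mixture parameters in shared memory once per block.
    __shared__ float s_weight[K];
    __shared__ float s_mean[K * D];
    __shared__ float s_var[K * D];
    for (int t = threadIdx.x; t < K; t += blockDim.x) s_weight[t] = weight[t];
    for (int t = threadIdx.x; t < K * D; t += blockDim.x) {
        s_mean[t] = mean[t];
        s_var[t]  = var[t];
    }
    __syncthreads();

    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= N) return;

    const float TWO_PI = 6.28318530718f;

    // Log-density of point i under each component (log-sum-exp for stability).
    float logp[K];
    float maxlog = -INFINITY;
    for (int k = 0; k < K; ++k) {
        float acc = logf(s_weight[k]);
        for (int d = 0; d < D; ++d) {
            float xd   = x[d * N + i];           // coalesced: stride-1 across threads
            float diff = xd - s_mean[k * D + d];
            float v    = s_var[k * D + d];
            acc += -0.5f * (logf(TWO_PI * v) + diff * diff / v);
        }
        logp[k] = acc;
        maxlog  = fmaxf(maxlog, acc);
    }
    float denom = 0.0f;
    for (int k = 0; k < K; ++k) denom += expf(logp[k] - maxlog);

    // Normalized responsibilities, also written in SoA layout.
    for (int k = 0; k < K; ++k)
        resp[k * N + i] = expf(logp[k] - maxlog) / denom;
}
```

In such a design, the M-step reductions (component sums of responsibilities and responsibility-weighted data) would typically be parallelized with per-block partial sums, but the details depend on the paper's implementation and are not shown here.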
