Abstract
Most machine learning algorithms need to handle large data sets, which often imposes limitations on processing time and memory. Expectation-Maximization (EM) is one such algorithm; it is used to train one of the most commonly used parametric statistical models, the Gaussian Mixture Model (GMM). All steps of the algorithm are potentially parallelizable, since they iterate over the entire data set. In this study, we propose a parallel implementation of EM for training GMMs using CUDA. Experiments are performed with a UCI dataset, and the results show a speedup of 7 compared to the sequential version. We have also modified the code to provide better access to global memory and to make use of shared memory. We achieved an occupancy of up to 56.4%, regardless of the number of Gaussians considered in the set of experiments.
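To make the parallelization concrete, the sketch below shows one possible E-step kernel in CUDA-C, with one thread computing the responsibilities of all K Gaussians for a single sample. It is a minimal illustration under assumed conventions (diagonal covariances, row-major layout), not the authors' code; identifiers such as eStepKernel, d_data and d_resp are hypothetical.

__global__ void eStepKernel(const float *d_data,   // N x D samples, row-major
                            const float *d_mean,   // K x D component means
                            const float *d_invVar, // K x D inverse variances (diagonal covariances)
                            const float *d_logW,   // K log mixture weights
                            float *d_resp,         // N x K responsibilities (output)
                            int N, int D, int K)
{
    int n = blockIdx.x * blockDim.x + threadIdx.x;   // one thread per sample
    if (n >= N) return;

    // log w_k + log N(x_n | mu_k, Sigma_k), up to a constant common to all k
    float maxLog = -1e30f;
    for (int k = 0; k < K; ++k) {
        float logp = d_logW[k];
        for (int d = 0; d < D; ++d) {
            float iv   = d_invVar[k * D + d];              // 1 / sigma^2_{k,d}
            float diff = d_data[n * D + d] - d_mean[k * D + d];
            logp += 0.5f * (logf(iv) - diff * diff * iv);
        }
        d_resp[n * K + k] = logp;
        maxLog = fmaxf(maxLog, logp);
    }

    // Normalize with the log-sum-exp trick so the responsibilities sum to 1.
    float sum = 0.0f;
    for (int k = 0; k < K; ++k)
        sum += expf(d_resp[n * K + k] - maxLog);
    for (int k = 0; k < K; ++k)
        d_resp[n * K + k] = expf(d_resp[n * K + k] - maxLog) / sum;
}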
Highlights
The clear advantage of using Graphical Processing Units (GPUs) is their small cost compared to clusters or supercomputers. Machine Learning (ML) algorithms are often costly, since learning is a task that requires a large amount of knowledge and its constant refinement, demanding massive data computation
In contrast to the approaches of (Machlica et al., 2011) and (Kumar et al., 2009), in our proposal the main loop of the algorithm is implemented sequentially and different CUDA kernels are in charge of running the different steps of the algorithm (see the driver sketch after these highlights)
We have used the Arabic Spoken Digit dataset from the UCI Repository to test the algorithm implementation. This dataset consists of instances with 13 Mel Frequency Cepstral Coefficients (MFCCs), widely used to represent audio signals in speech processing systems, which commonly use Gaussian Mixture Models (GMMs) to model the distribution of phones in the language
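As a complement to the second highlight, the host-side sketch below illustrates one way a sequential main loop can drive separate CUDA kernels for the E-step, the M-step and the convergence test. The kernel names, the logLikelihoodReduce helper and the stopping rule are assumptions for illustration, not the paper's actual interface.

#include <cmath>

// Hypothetical kernel and helper declarations; bodies omitted.
__global__ void eStepKernel(const float *d_data, const float *d_mean,
                            const float *d_invVar, const float *d_logW,
                            float *d_resp, int N, int D, int K);
__global__ void mStepKernel(const float *d_data, const float *d_resp,
                            float *d_mean, float *d_invVar, float *d_logW,
                            int N, int D, int K);
float logLikelihoodReduce(const float *d_resp, int N, int K);  // host-side reduction

// Sequential EM loop on the CPU; each step of an iteration runs as a CUDA kernel.
void trainGMM(const float *d_data, float *d_mean, float *d_invVar,
              float *d_logW, float *d_resp,
              int N, int D, int K, int maxIters, float tol)
{
    dim3 block(256);
    dim3 grid((N + block.x - 1) / block.x);

    float prevLogLik = -1e30f;
    for (int it = 0; it < maxIters; ++it) {
        // E-step: responsibilities of every Gaussian for every sample.
        eStepKernel<<<grid, block>>>(d_data, d_mean, d_invVar, d_logW,
                                     d_resp, N, D, K);
        // M-step: re-estimate weights, means and (diagonal) covariances,
        // e.g. with one block per Gaussian performing a reduction over samples.
        mStepKernel<<<K, block>>>(d_data, d_resp, d_mean, d_invVar, d_logW,
                                  N, D, K);
        // Stop when the log-likelihood no longer improves significantly.
        float logLik = logLikelihoodReduce(d_resp, N, K);
        if (fabsf(logLik - prevLogLik) < tol) break;
        prevLogLik = logLik;
    }
}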
Summary
The clear advantage of using GPUs is their small cost compared to clusters or supercomputers. Machine Learning (ML) algorithms are often costly, since learning is a task that requires a large amount of knowledge and its constant refinement, demanding massive data computation. A major problem of massive computing is the limited processing power of mainstream multi-core processors. Even an older graphics card such as the NVIDIA GeForce 8400 GS, for instance, is able to run up to 32 threads in parallel per clock cycle, under some restrictions. This limitation can be overcome through parallel processing of data on newer architectures, and applications in several domains can be observed in the recent scientific literature. One of these recent architectures is NVIDIA's CUDA. It is possible to use the CUDA-C programming language, for instance, to write parallelized source code.
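As a concrete illustration of the kind of CUDA-C memory optimization mentioned in the abstract (better global-memory access and shared-memory usage), the sketch below stages the component means in shared memory once per block, so that every thread reads them from fast on-chip memory instead of repeatedly from global memory. It is an assumed example, not the paper's code; distancesKernel, d_dist and the other identifiers are hypothetical.

__global__ void distancesKernel(const float *d_data,  // N x D samples
                                const float *d_mean,  // K x D component means
                                float *d_dist,        // N x K squared distances (output)
                                int N, int D, int K)
{
    extern __shared__ float s_mean[];                 // K * D floats, sized at launch

    // Cooperative, coalesced copy of all means into shared memory.
    for (int i = threadIdx.x; i < K * D; i += blockDim.x)
        s_mean[i] = d_mean[i];
    __syncthreads();

    int n = blockIdx.x * blockDim.x + threadIdx.x;    // one thread per sample
    if (n >= N) return;

    for (int k = 0; k < K; ++k) {
        float acc = 0.0f;
        for (int d = 0; d < D; ++d) {
            float diff = d_data[n * D + d] - s_mean[k * D + d];
            acc += diff * diff;
        }
        d_dist[n * K + k] = acc;
    }
}

// Launch example: the dynamic shared-memory size is the third launch parameter.
// distancesKernel<<<grid, 256, K * D * sizeof(float)>>>(d_data, d_mean, d_dist, N, D, K);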