On the Distributed Implementation of Unsupervised Extreme Learning Machines for Big Data

Yara Rizk, Mariette Awad

doi:10.1016/j.procs.2015.07.291

Abstract

The emergence of big data has pushed the machine learning research community to develop unsupervised, distributed, and computationally efficient learning algorithms that can exploit it. Extreme learning machines (ELM) have gained popularity as a neuron-based architecture with fast training and good generalization. In this work, we parallelize the unsupervised ELM (US-ELM) algorithm proposed in the literature on a distributed framework to learn clustering models from big data. We propose three approaches: 1) Parallel US-ELM, which simply distributes the data over computing nodes; 2) Hierarchical US-ELM, which clusters the data hierarchically; and 3) Ensemble US-ELM, which is an ensemble of weak ELM models. When tested on multiple datasets from the UCI repository, the algorithms trained faster than their serial counterparts and generalized better than other clustering algorithms in the literature.

Full Text