Abstract

In the field of machine learning, offline training and online training occupy the same important position because they coexist in many real applications. The extreme learning machine (ELM) has the characteristics of fast learning speed and high accuracy for offline training, and online sequential ELM (OS-ELM) is a variant of ELM that supports online training. With the explosive growth of data volume, running these algorithms on distributed computing platforms is an unstoppable trend, but there is currently no efficient distributed framework to support both ELM and OS-ELM. Apache Flink is an open-source stream-based distributed platform for both offline processing and online data processing with good scalability, high throughput, and fault-tolerant ability, so it can be used to accelerate both ELM and OS-ELM. In this paper, we first research the characteristics of ELM, OS-ELM and distributed computing platforms, then propose an efficient stream-based distributed framework for both ELM and OS-ELM, named ELM-SDF, which is implemented on Flink. We then evaluate the algorithms in this framework with synthetic data on distributed cluster. In summary, the advantages of the proposed framework are highlighted as follows. (1) The training speed of FLELM is always faster than ELM on Hadoop and Spark, and its scalability behaves better as well. (2) Response time and throughput of FLOS-ELM achieve better performance than OS-ELM on Hadoop and Spark when the incremental training samples arrive. (3) The response time and throughput of FLOS-ELM behave better in native-stream processing mode when the incremental data samples are continuously arriving.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.