Abstract

One of the challenges that technology development poses for data mining is that both data size and dimensionality are growing rapidly. K-means, one of the most popular clustering algorithms in data mining, suffers from long computation times when applied to large data sets and high-dimensional data. In this paper, we propose a hardware architecture for K-means with triangle inequality optimization on FPGA. An optimal 8-bit square calculator for 6-LUT architectures is described to minimize the hardware cost, and an approximation is proposed to avoid the square root calculation in the original triangle inequality optimization. Our software and hardware implementations are evaluated on the MNIST benchmark and on uniform random numbers of various sizes. The approximation results in 2% more distance calculations for MNIST and 5% more for uniform random numbers than the original optimization. Compared to the baseline hardware system without optimization, our approach achieves up to 77% improvement in processing time with about 10% logic overhead. We demonstrate that the hardware can achieve a 55-fold speedup compared to software for the 1024 MNIST.
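For readers unfamiliar with the triangle inequality optimization, the following minimal Python sketch illustrates the basic pruning idea in the assignment step of K-means. It is not the architecture or the approximation proposed in the paper, whose details the abstract does not give. It only shows the textbook test: if d(c, c') >= 2 d(x, c), then d(x, c') >= d(x, c), so the distance to c' need not be computed. Because both sides are non-negative, this particular test can be written with squared distances and so needs no square root. The function names `sq_dist`, `center_sq_table`, and `assign` are hypothetical, and integer coordinates (such as 8-bit pixel values) are assumed so that the comparison is exact.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two integer vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def center_sq_table(centers):
    """Precompute squared distances between every pair of centers."""
    return [[sq_dist(ci, cj) for cj in centers] for ci in centers]


def assign(point, centers, cc_sq):
    """Return (index of the nearest center, number of distances computed).

    Assumption: coordinates are integers, so all comparisons are exact.
    """
    best = 0
    best_sq = sq_dist(point, centers[0])
    computed = 1
    for j in range(1, len(centers)):
        # Triangle inequality in squared form:
        # d(c_best, c_j) >= 2 * d(x, c_best)  <=>  d^2(c_best, c_j) >= 4 * d^2(x, c_best)
        # When it holds, c_j cannot be closer than c_best, so skip it.
        if cc_sq[best][j] >= 4 * best_sq:
            continue
        d = sq_dist(point, centers[j])
        computed += 1
        if d < best_sq:
            best, best_sq = j, d
    return best, computed
```

The second return value counts how many point-to-center distances were actually evaluated, which is the kind of quantity the percentages above refer to. Full versions of the optimization also maintain distance bounds across iterations, which this sketch omits.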
