MP-SMO: um algoritmo para a implementação VLSI do treinamento de máquinas de vetores de suporte.

Raúl Acosta Hernández

doi:10.11606/d.3.2009.tde-28102009-172855

Abstract

Learning Machines, like Artificial Neural Networks (ANNs), Bayesian Networks, Support Vector Machines (SVMs) and others are applied in pattern classification problems. As the test error in SVM is small, it has several applications, such as image recognition, gene selection, text classification, robotics, handwritten recognition and others. Among the developed algorithms for the SVM training, the Sequential Minimal Optimization (SMO) is one of the fastest and the simplest to implement in software. Due to its importance, many improvements have been proposed in order to obtain even faster solutions than the original algorithm. Most of the SVM training implementations are in software. However, in some applications with restrictions of: area, and/or power and/or training time, a hardware implementation is necessary, for example, in some mobile or portable applications. In related previous works, the SVMs were trained in hardware using sets of only tens of examples, and in only one implementation the SMO algorithm was employed. In this work, a modified version of the SMO algorithm, named here the Multiple Pairs SMO (MP-SMO) algorithm, for the SVM training acceleration is presented. The training time reduction is obtained optimizing per iteration one or more pairs of coefficients known as Lagrange Multipliers, instead of only one pair as in the original SMO algorithm. The MP-SMO algorithm has the following features: 1) the optimization of each pair is as simple as in the original SMO algorithm because of the use of the same analytical method. 2) the solution for the pairs of coefficients selection can be chosen between two adapted heuristics for the SMO algorithm. The algorithm was tested optimizing up to two, three and four pairs of coefficients per iteration, and the training time was improved, when compared against the SMO algorithm. The tests for seven benchmarks showed an improvement that ranged from 22.5% to 42.8%. The reduction of the training time of the SMO algorithm executed in hardware is also treated in this dissertation. The algorithms SMO and MP-SMO were completely implemented in dedicated hardware for the Tic-tac-toe endgame benchmark. This benchmark is composed of 958 examples, a number greater than the used in the previous hardware implementations. The implementation of the MP-SMO algorithm is intended to reduce the number of iterations, as in the software implementation, and to include parallelism in the hardware implementation. In order to reduce the iteration execution time, the pipeline and parallel architectures were realized. Sixteen different architectures were implemented and tested on a Field Programmable Gate Array (FPGA) device, combining or not the SMO or MP-SMO algorithm with pipelining and/or parallelism. The training time was reduced to 1.8% of that obtained with the SMO algorithm without neither pipelining nor parallelism, that is, more than 50 times. This dissertation also presents an analysis of the area and power cost of the training speed increase. LISTA DE ABREVIATURAS E SIGLAS ANN Artificial Neural Network ASIC Application Specific Integrated Circuit BSV Bluespec System Verilog CORDIC Coordinate Rotation Digital Computer F-heuristica Heuristica de Fan; Chen e Lin FPGA Field Programmable Gate Array JOP Java Optimized Processor K-heuristica Heuristica de Keerthi et al. KKT Karush-Kuhn-Tucker LUT Look Up Table MP-SMO Multiple Pairs-Sequential Minimal Optimization PDA Personal Digital Assistant QP Quadratic Programming RBF Radial Basis Function SMO Sequential Minimal Optimization SRAM Static Random Access Memory SVM Support Vector Machine UCI University of California at Irvine

Full Text