Machine Learning (ML) has been applied in many areas because of its robustness, usability, and reliability, particularly in hardware implementations. One of its best-known algorithms is the Support Vector Machine (SVM), which is the simplest to implement in hardware because of its mathematical formulation. In this study, we propose a hardware implementation of multi-class SVM classifiers within the asynchronous paradigm (i.e., without a clock signal) using a 4-stage pipeline architecture. To evaluate the behavior of the proposed architecture, we prototyped the circuit on a Field Programmable Gate Array (FPGA) device. The proposed asynchronous SVM classifier is applied to a 30-class speech recognition system. Because of its computational load, the training phase was carried out in software, using a hybrid algorithm known as PSO-SVM run in Matlab. For the SVM classification phase, we propose, for the first time to the best of our knowledge, a four-stage asynchronous pipeline architecture that uses a Multiply-Accumulator (MAC) unit and three different control circuits described from Extended Burst-Mode (XBM) and State Transition Graph (STG) specifications, leading to an energy-efficient design. To validate the SVM recognition results in the speech recognition application, the tests draw on 60 speech samples and 20 speakers, which makes the test data set diversified and reliable. The main goal was to design a machine learning hardware implementation for a low-power application and, through it, to show that the asynchronous paradigm reduces power consumption. As a result, we obtained a reduced power consumption of 5.72 mW, a fast average response time of 0.61 μs, and the most area-efficient circuit (1315 LUTs). Recognition accuracy was another concern, and the classifier achieved a 98% success rate.
In addition, we present comparisons with an asynchronous version of the same SVM datapath and with different synchronous architectures from the literature to show that our proposal achieves lower power consumption and smaller area. For hardware applications in which low power and high performance are the desired features, the presented architecture ranked best when compared with similar works from the recent technical literature on pattern recognition systems.
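
As a rough software illustration of the MAC-based classification phase described above, the following Python sketch computes one score per class by repeated multiply-accumulate steps and picks the highest one. It is only a minimal sketch under stated assumptions: a linear kernel and a one-vs-rest decision rule are assumed here for simplicity, and the names `mac_dot` and `classify` are hypothetical. It does not model the asynchronous pipeline, the XBM/STG control circuits, or the PSO-SVM training.

```python
def mac_dot(weights, features):
    """Dot product computed as a chain of multiply-accumulate steps."""
    acc = 0.0
    for w, x in zip(weights, features):
        acc += w * x  # one MAC operation per feature
    return acc


def classify(weight_rows, biases, features):
    """Return the index of the class with the highest score.

    Assumption: linear kernel, one-vs-rest scoring (not stated in the source).
    """
    scores = [mac_dot(w, features) + b for w, b in zip(weight_rows, biases)]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy example with three classes and two features (illustrative values only).
weight_rows = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
biases = [0.0, 0.0, 0.5]
predicted = classify(weight_rows, biases, [0.2, 0.9])
```

In a hardware setting the same accumulate loop would map onto a single MAC unit reused across features, which is one reason the decision function is comparatively cheap in area.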