Logarithm-approximate floating-point multiplier is applicable to power-efficient neural network training
- Conference Article
12
- 10.1109/patmos.2019.8862162
- Jul 1, 2019
This paper proposes adopting a logarithm-approximate multiplier (LAM) for multiply-accumulate (MAC) computation in a neural network (NN) training engine, where LAM approximates a floating-point multiplication as an addition, resulting in smaller delay, fewer gates, and lower power consumption. Our implementation of an NN training engine for a 2-D classification dataset achieves a 10% speed-up and 2.5X and 2.3X efficiency improvements in power and area, respectively. LAM is also highly compatible with conventional bit-width scaling (BWS). When BWS is applied with LAM on four test datasets, more than 5.2X power efficiency improvement is achievable with only 1% accuracy degradation, of which a 2.3X improvement originates from LAM.
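The idea of replacing a floating-point multiplication with an addition can be illustrated in software. The following is a minimal sketch in the spirit of Mitchell's logarithmic approximation, assuming IEEE 754 binary32 operands that are normal and nonzero and a product that neither overflows nor underflows; the helper names (`float_to_bits`, `bits_to_float`, `lam_multiply`) are hypothetical, and the hardware design in the paper may differ in its details.

```python
import struct

# Exponent bias of IEEE 754 binary32 (127), shifted into the exponent field.
BIAS = 127 << 23


def float_to_bits(x: float) -> int:
    """Return the binary32 bit pattern of x as an unsigned integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_float(b: int) -> float:
    """Interpret an unsigned 32-bit integer as a binary32 value."""
    return struct.unpack("<f", struct.pack("<I", b))[0]


def lam_multiply(a: float, b: float) -> float:
    """Approximate a * b with one integer addition (sketch only).

    Assumes normal, nonzero inputs and an in-range result; zeros,
    subnormals, infinities and NaNs are NOT handled here.
    """
    ba, bb = float_to_bits(a), float_to_bits(b)
    sign = (ba ^ bb) & 0x80000000
    # Adding exponent+mantissa fields approximates adding log2 values;
    # one bias must be subtracted because both operands carry it.
    mag = (ba & 0x7FFFFFFF) + (bb & 0x7FFFFFFF) - BIAS
    return bits_to_float(sign | (mag & 0x7FFFFFFF))
```

When one operand is a power of two the result is exact; otherwise the approximation underestimates the product, with a worst-case relative error of about 11.1% when both mantissa fractions equal 0.5 (for example, 1.5 × 1.5 yields 2.0 instead of 2.25).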
- Research Article
- 10.30917/att-vk-1814-9588-2023-1-4
- Feb 1, 2023
- Veterinaria i kormlenie
The purpose of the research, the results of which are presented in this article, is to determine the possibility and evaluate the effectiveness of using a trained neural network in the diagnosis of ringworm. The article provides an analysis of the methods used for diagnosing dermatomycosis in veterinary practice. One of the actively developing areas at present is the use of artificial neural networks in the diagnosis of animal diseases. The authors have developed a method for diagnosing dermatophytosis using a trained neural network. To identify hair damaged by dermatophyte spores in cats, a trained artificial neural network, YOLO v5, was used; it is based on the YOLO architecture and provides high accuracy and speed of object detection in images. Diagnostics was carried out in three stages. The first stage: diagnosis of hair damaged by dermatophyte spores in cats using the trained artificial neural network. The second stage: microscopy by a veterinary specialist of the veterinary center. The third stage: comparison of the data obtained from the trained artificial neural network and from the veterinary specialists. Three comparative experiments were carried out on 20 depersonalized samples with different ratios from healthy and sick animals. Testing of the trichoscopy method using artificial neural networks for diagnosing spore-damaged hair in cats showed that, of 60 studied samples, the trained artificial neural network diagnosed dermatophyte spore damage in 20 samples and the veterinarian in 17. All positive results were confirmed by a mycological laboratory study and identification of the pathogen. It has been established that the use of a trained artificial neural network increases the diagnostic efficiency by 15% and reduces the time to perform diagnostic microscopy by 60.3%.
The application of the proposed method makes it possible to reduce the time of microscopic examination, improve the accuracy of interpretation of the results, automate methods for identifying causative agents of ringworm in small animals, and take timely measures to treat the animal.
- Research Article
13
- 10.3390/a14040107
- Mar 28, 2021
- Algorithms
The accurate identification of intrinsically disordered proteins or protein regions is of great importance, as they are involved in critical biological processes and are related to various human diseases. In this paper, we develop a deep neural network that is based on the well-known VGG16. Our deep neural network is trained using 1450 proteins from the dataset DIS1616, and the trained neural network is tested on the remaining 166 proteins. Our trained neural network is also tested on the blind test sets R80 and MXD494 to further demonstrate the performance of our model. The MCC value of our trained deep neural network is 0.5132 on the test set DIS166, 0.5270 on the blind test set R80 and 0.4577 on the blind test set MXD494. All of these MCC values of our trained deep neural network exceed the corresponding values of existing prediction methods.
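The MCC values quoted above refer to the Matthews correlation coefficient, a standard metric for binary classification. As a brief illustration of how such a value is obtained from confusion-matrix counts, here is a minimal sketch; the function name `mcc` and the convention of returning 0.0 when the denominator vanishes are assumptions for illustration, not details taken from the paper.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (chance level)
    to +1 (perfect prediction).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumed convention: undefined cases are reported as 0.0.
        return 0.0
    return (tp * tn - fp * fn) / denom
```

Because the metric uses all four cells of the confusion matrix, it remains informative when the two classes (here, ordered and disordered residues) are unbalanced.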
- Conference Article
2
- 10.1145/3613424.3623779
- Oct 28, 2023
Neural network training is inherently sequential where the layers finish the forward propagation in succession, followed by the calculation and back-propagation of gradients (based on a loss function) starting from the last layer. The sequential computations significantly slow down neural network training, especially the deeper ones. Prediction has been successfully used in many areas of computer architecture to speed up sequential processing. Therefore, we propose ADA-GP, which uses gradient prediction adaptively to speed up deep neural network (DNN) training while maintaining accuracy. ADA-GP works by incorporating a small neural network to predict gradients for different layers of a DNN model. ADA-GP uses a novel tensor reorganization method to make it feasible to predict a large number of gradients. ADA-GP alternates between DNN training using backpropagated gradients and DNN training using predicted gradients. ADA-GP adaptively adjusts when and for how long gradient prediction is used to strike a balance between accuracy and performance. Last but not least, we provide a detailed hardware extension in a typical DNN accelerator to realize the speed up potential from gradient prediction. Our extensive experiments with fifteen DNN models show that ADA-GP can achieve an average speed up of 1.47 × with similar or even higher accuracy than the baseline models. Moreover, it consumes, on average, 34% less energy due to reduced off-chip memory accesses compared to the baseline accelerator.
- Conference Article
34
- 10.1109/3ict.2018.8855743
- Nov 1, 2018
Artificial neural networks (ANN) have been widely used in the field of data classification. Normally, a neural network is trained with the traditional back propagation technique. As this approach has various drawbacks, training of the neural network is instead done with Particle Swarm Optimization (PSO). PSO has been widely used to solve diverse kinds of optimization problems. Population initialization plays a significant role in meta-heuristic algorithms. This paper describes a new population initialization approach, Log Logistic, termed PSOLL-NN, to initialize the swarm. The proposed algorithm has been tested for weight optimization of a feed-forward neural network and compared with the back propagation algorithm (BPA), standard PSO (PSO-NN), and PSO initialized with the Halton sequence (PSOH-NN), Torus sequence (PSOT-NN) and Sobol sequence (PSOS-NN). The experimental results show that the proposed technique performed markedly better than the other traditional techniques. Moreover, the outcome of our work offers insight into how the proposed initialization technique can be used as an efficient alternative to standard training approaches for data classification problems.
- Conference Article
21
- 10.1109/icdcece53908.2022.9792645
- Apr 23, 2022
Training a neural network becomes harder as it goes deeper. Deeper architectures make neural networks more difficult to train because of vanishing gradients and complexity problems, and training deeper neural networks becomes time-consuming and demands heavy use of computing resources. Introducing residual blocks makes it possible to train substantially deeper networks than those used previously. Residual networks achieve this by attaching a skip connection to the layers of artificial neural networks. This paper presents residual networks and how they are formulated; we will see that residual networks obtain good accuracy and that the model is easier to optimize, because ResNet makes training of large structured neural networks more efficient. We evaluate residual nets on the ImageNet dataset with a depth of 152 layers, which is 8x deeper than VGG nets yet less complex. This residual architecture reaches an error of 3.57% on the ImageNet test dataset. We also compare the ResNet result to its equivalent convolutional network without residual connections. Our results show that ResNet provides higher accuracy but is also more prone to overfitting. Stochastic augmentation of training datasets and adding dropout layers to networks are some of the overfitting prevention methods.
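The connection that a residual block adds around its layers can be written as y = ReLU(x + F(x)), where F is the stack of layers being bypassed. The sketch below illustrates this with two small fully connected layers in plain Python; the helper names (`relu`, `matvec`, `residual_block`) and the choice of dense layers without biases are simplifying assumptions, whereas the networks discussed above use convolutional layers.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def relu(v: Vector) -> Vector:
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in v]


def matvec(w: Matrix, v: Vector) -> Vector:
    """Multiply matrix w (list of rows) by vector v."""
    return [sum(wi * xi for wi, xi in zip(row, v)) for row in w]


def residual_block(x: Vector, w1: Matrix, w2: Matrix) -> Vector:
    """Compute ReLU(x + F(x)) with F(x) = w2 @ ReLU(w1 @ x).

    The input x is added back to the output of the two layers, so the
    layers only need to learn a correction (residual) to the identity.
    """
    f = matvec(w2, relu(matvec(w1, x)))
    return relu([xi + fi for xi, fi in zip(x, f)])
```

With all weights set to zero the block reduces to ReLU(x), so a non-negative input passes through unchanged. This is one intuition for why stacking many such blocks does not make optimization harder in the way that stacking plain layers does.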
- Conference Article
15
- 10.1109/arith54963.2022.00010
- Sep 1, 2022
Low-precision formats have recently driven major breakthroughs in neural network (NN) training and inference by reducing the memory footprint of the NN models and improving the energy efficiency of the underlying hardware architectures. Narrow integer data types have been vastly investigated for NN inference and have successfully been pushed to the extreme of ternary and binary representations. In contrast, most training-oriented platforms use at least 16-bit floating-point (FP) formats. Lower-precision data types such as 8-bit FP formats and mixed-precision techniques have only recently been explored in hardware implementations. We present MiniFloat-NN, a RISC-V instruction set architecture extension for low-precision NN training, providing support for two 8-bit and two 16-bit FP formats and expanding operations. The extension includes sum-of-dot-product instructions that accumulate the result in a larger format and three-term additions in two variations: expanding and non-expanding. We implement an ExSdotp unit to efficiently support in hardware both instruction types. The fused nature of the ExSdotp module prevents precision losses generated by the non-associativity of two consecutive FP additions while saving around 30% of the area and critical path compared to a cascade of two expanding fused multiply-add units. We replicate the ExSdotp module in a SIMD wrapper and integrate it into an open-source floating-point unit, which, coupled to an open-source RISC-V core, lays the foundation for future scalable architectures targeting low-precision and mixed-precision NN training. A cluster containing eight extended cores sharing a scratchpad memory, implemented in 12 nm FinFET technology, achieves up to 575 GFLOPS/W when computing FP8-to-FP16 GEMMs at 0.8 V, 1.26 GHz.
- Research Article
42
- 10.1016/j.jco.2020.101540
- Nov 27, 2020
- Journal of Complexity
Non-convergence of stochastic gradient descent in the training of deep neural networks
- Conference Article
14
- 10.1145/3404397.3404408
- Aug 17, 2020
Deep neural networks (DNNs) have gained considerable attention in various real-world applications due to their strong performance on representation learning. However, a DNN needs to be trained for many epochs to reach a higher inference accuracy, which requires storing sequential versions of DNNs and releasing the updated versions to users. As a result, large amounts of storage and network resources are required, significantly hampering DNN utilization on resource-constrained platforms (e.g., IoT, mobile phone). In this paper, we present a novel delta compression framework called Delta-DNN, which can efficiently compress the float-point numbers in DNNs by exploiting the similarity of floats that exists in DNNs during training. Specifically, (1) we observe the high similarity of float-point numbers between the neighboring versions of a neural network in training; (2) inspired by the delta compression technique, we only record the delta (i.e., the differences) between two neighboring versions, instead of storing the full new version for DNNs; (3) we use error-bounded lossy compression to compress the delta data for a high compression ratio, where the error bound is strictly assessed by an acceptable loss of DNNs’ inference accuracy; (4) we evaluate Delta-DNN’s performance on two scenarios, including reducing the transmission of releasing DNNs over the network and saving the storage space occupied by multiple versions of DNNs. According to experimental results on six popular DNNs, Delta-DNN achieves a compression ratio 2× to 10× higher than state-of-the-art methods, without sacrificing inference accuracy or changing the neural network structure.
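Steps (2) and (3) above can be sketched generically: store only the quantized difference between two neighboring weight versions, with a quantization step chosen so that the reconstruction error stays within a bound. The code below is an illustrative sketch only; it assumes uniform quantization with step 2·eps followed by zlib, which is a generic error-bounded scheme and not necessarily the one used in Delta-DNN, and the names `encode_delta` and `decode_delta` are hypothetical.

```python
import struct
import zlib
from typing import List


def encode_delta(prev: List[float], new: List[float], eps: float) -> bytes:
    """Quantize (new - prev) with step 2*eps and compress the integers.

    Rounding to the nearest multiple of 2*eps keeps the absolute
    reconstruction error of each weight at or below eps (up to
    floating-point rounding).
    """
    step = 2.0 * eps
    q = [round((n - p) / step) for p, n in zip(prev, new)]
    # Similar neighboring versions give many zeros, which compress well.
    return zlib.compress(struct.pack(f"<{len(q)}i", *q))


def decode_delta(prev: List[float], blob: bytes, eps: float) -> List[float]:
    """Rebuild an approximation of the new version from prev and the delta."""
    step = 2.0 * eps
    raw = zlib.decompress(blob)
    q = struct.unpack(f"<{len(raw) // 4}i", raw)
    return [p + k * step for p, k in zip(prev, q)]
```

A receiver that already holds the previous version needs only the compressed blob to obtain the next one. In practice the bound `eps` would be tuned against the acceptable loss in inference accuracy, as the abstract describes.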
- Research Article
3
- 10.1007/s12239-019-0128-2
- Nov 1, 2019
- International Journal of Automotive Technology
Surface deflection is a phenomenon that causes fine wrinkles on the outer surfaces of sheet metal and deteriorates product external appearance. It is quantitatively defined as the difference between the section curve of the sheet and the ideal curve. In this study, using neural networks, a prediction model for surface deflection according to material properties was constructed and combined with a genetic algorithm; the combination of the material properties was studied to predict the minimum surface deflection. Because of the limited number of simulation data, neural networks were developed using several sampling methods such as central composite design, Latin hypercube sampling, and random sampling. In the training of the neural networks, the optimal hyper-parameter of the neural network was found automatically using Latin hypercube sampling. In conclusion, for prediction of surface deflection in rectangular embossing, neural networks made by central composite design showed the best performance. In addition, it was confirmed that the procedure of combining automatic training of a neural network and the genetic algorithm accurately predicted the set of material properties that generates the minimum surface deflection. Also, the quantity of surface deflection predicted by the neural network was very close to that predicted by finite element analysis.
- Research Article
3
- 10.3390/math11010164
- Dec 28, 2022
- Mathematics
Approaches presented today in the scientific literature suggest that there are no methodological solutions based on the training of artificial neural networks to predict the direction of industrial development, taking into account a set of factors: innovation, environmental friendliness, modernization and production growth. The aim of the study is to develop a predictive model of performance management of innovative industrial systems by building neural networks. The research methods were correlation analysis, training of neural networks (regression type), extrapolation, and exponential smoothing. As a result of the research, a technique for the comprehensive estimation of the efficiency of an innovative industrial system is developed, considering the criteria of technical modernization, development, innovative activity, and ecologization; the prognostic neural network models make it possible to optimize the contribution of features to the formation of target (set) values of efficiency indicators for macro- and micro-industrial systems, which will help to level the growth trajectory of industrial systems; and priority directions for their development are proposed. The following conclusions are drawn: the efficiency of industrial systems is determined by the volume of sales of goods, innovative products and waste recycling, which makes it possible to save resources; the results of forecasting depend significantly on the dataset formulated. Although multilayer neural networks independently select important features, it is advisable to conduct a correlation analysis beforehand, which will provide a higher probability of building a high-quality predictive model.
The novelty of the research lies in the development and testing of a unique methodology to assess the effectiveness of industrial systems: it is based on a multidimensional system approach (takes into account factors of innovation, environmental friendliness, modernization and production growth); it combines a number of methodological tools (correlation, ranking and weighting); it expands the method of effectiveness assessment in terms of the composition of variables (previously presented approaches are limited to the aspects considered).
- Research Article
- 10.5937/grmk1501003k
- Jan 1, 2015
- Gradjevinski materijali i konstrukcije
In the present paper, concrete compressive strength is evaluated using a back propagation feed-forward artificial neural network. Training of the neural network is performed using the Levenberg-Marquardt learning algorithm for four architectures of artificial neural networks, with one, three, eight and twelve nodes in a hidden layer, in order to avoid the occurrence of overfitting. Training, validation and testing of the neural network are conducted for 75 concrete samples with distinct w/c ratios and amounts of melamine-type superplasticizer. These specimens were exposed to different numbers of freeze/thaw cycles and their compressive strength was determined after 7, 20 and 32 days. The obtained results indicate that the neural network with one hidden layer and twelve hidden nodes gives reasonable prediction accuracy in comparison to experimental results (R=0.965, MSE=0.005). These results of the performed analysis are further confirmed by calculating the standard statistical errors: the chosen architecture of the neural network shows the smallest values of mean absolute percentage error (MAPE=, variance absolute relative error (VARE) and median absolute error (MEDAE), and the highest value of variance accounted for (VAF).
- Research Article
3
- 10.3389/frai.2024.1368569
- Jun 21, 2024
- Frontiers in artificial intelligence
The training of neural networks (NNs) is a computationally intensive task requiring significant time and resources. This article presents a novel approach to NN training using adiabatic quantum computing (AQC), a paradigm that leverages the principles of adiabatic evolution to solve optimization problems. We propose a universal AQC method that can be implemented on gate quantum computers, allowing for a broad range of Hamiltonians and thus enabling the training of expressive neural networks. We apply this approach to various neural networks with continuous, discrete, and binary weights. The study results indicate that AQC can very efficiently evaluate the global minimum of the loss function, offering a promising alternative to classical training methods.
- Research Article
- 10.17816/dd627076
- Jul 3, 2024
- Digital Diagnostics
BACKGROUND: Currently, artificial intelligence in the form of artificial neural networks is being actively implemented in a number of areas of our lives, including medicine. In particular, in otorhinolaryngology, artificial neural networks are used to analyze images obtained during endoscopic examinations of patients (e.g., videolaryngoscopy) [1–3]. The interpretation of laryngoscopic images often presents significant difficulties for practicing physicians, which reduces the frequency of detection of precancerous laryngeal diseases and contributes to the increase in the number of patients with stage III–IV laryngeal cancer [4, 5]. This underscores the significance of prompt performance and accurate interpretation of the findings of endoscopic examinations of patients with laryngeal disorders. Artificial neural networks can be employed to analyze the results of videolaryngoscopy, furnishing the physician with supplementary information that can enhance diagnostic accuracy and diminish the probability of error [6, 7]. AIM: The study aims to develop and train an artificial neural network for recognizing characteristic features of laryngeal neoplasms and variants of laryngeal normality. MATERIALS AND METHODS: The study was conducted under the grant of the Moscow Center for Innovative Technologies in Healthcare (grant No. 2112-1/22) entitled “Using Neural Networks (Artificial Intelligence Algorithms) for Control and Improving the Quality of Diagnosis and Treatment of Diseases of Laryngeal and Ear Structures through Digital Technologies”. The following methods were used during the course of the study: data collection for the creation of a photobank (dataset) of medical images obtained during videolaryngoscopy; data partitioning for the formation of datasets for individual nosologies and groups of diseases; the method of consilium; analysis of the accuracy of recognition and classification of digital endoscopic images; and training of classification neural networks.
Consequently, a dataset comprising 1,471 laryngeal images in digital formats (JPEG, BMP) was assembled, labelled, and uploaded for the purpose of training the artificial neural network. Of the total number of images, 410 were classified as pertaining to laryngeal masses, while 1,061 were classified as variants of normality. Subsequently, the neural network was trained and tested to identify the signs of normality and of laryngeal masses. RESULTS: Testing of the artificial neural network comprised the construction of a confusion matrix, the calculation of recognition accuracy, the calculation of the quality indicators of the model performance, and the construction of the ROC curve. The developed and trained artificial neural network demonstrated an accuracy of 86% in recognizing the signs of laryngeal masses and norms. CONCLUSIONS: This study demonstrates that a trained artificial neural network can successfully distinguish between signs of normality and laryngeal masses in endoscopic photographs. With further training of the neural network and achievement of high accuracy, this technology can be used in clinical practice as an assistant in the interpretation of laryngoscopic images and early diagnosis of laryngeal masses. It can also be employed to control and improve the quality of diagnosis and treatment of diseases of the throat, nose, and ears by primary care physicians.
- Research Article
- 10.1038/s42005-025-02384-8
- Nov 26, 2025
- Communications Physics
Training of neural networks (NNs) has emerged as a major consumer of both computational and energy resources. Quantum computers have been touted as a route to facilitate training, but no experimental evidence has been presented so far. Here we demonstrate that quantum annealing platforms, such as D-Wave, can enable fast and efficient training of classical NNs, which are then deployable on conventional hardware. From a physics perspective, NN training can be viewed as a dynamical phase transition: the system evolves from an initial spin glass state to a highly ordered, trained state. This process involves eliminating numerous undesired minima in its energy landscape. The advantage of annealing devices is their ability to rapidly find multiple deep states. We found that this quantum training achieves superior performance scaling compared to classical backpropagation methods, with a clearly higher scaling exponent (1.01 vs. 0.78). It may be further increased up to a factor of 2 with a fully coherent quantum platform using a variant of the Grover algorithm. Furthermore, we argue that even a modestly sized annealer can be beneficial to train a deep NN by being applied sequentially to a few layers at a time.