In a multi-channel radiation detector readout system, waveform sampling, digitization, and raw data transmission to the data acquisition system constitute a conventional processing chain. Quantities such as the energy deposited in the sensor and the signal arrival time are then estimated by extracting peak amplitudes, pulse-envelope areas, and signal start times from the raw data. However, such quantities can instead be estimated directly on the front-end Application-Specific Integrated Circuits (ASICs) using machine learning algorithms, an approach often termed "edge computing". Edge computing offers substantial benefits, especially when the analytical forms of the signals are not fully known or the registered waveforms suffer from noise and the imperfections of practical implementations. In this work, we aim to predict the peak amplitude from a single waveform snippet whose rising and falling edges each contain only 3 to 4 samples. We thoroughly studied two widely used neural network architectures, the Multi-Layer Perceptron (MLP) and the Convolutional Neural Network (CNN), by varying their model sizes. To better fit the resource constraints of front-end electronics, we also studied model reduction techniques such as network pruning and variable-bit quantization. By combining pruning and quantization, our best-performing model occupies 1.5 KB, reduced from the 16.6 KB of its full-model counterpart, and achieves a mean absolute error of 0.034, compared with 0.135 for a naive baseline. These parameter-efficient yet predictive neural network models establish the feasibility and practicality of deployment on front-end ASICs.
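To make the task concrete, the following is a minimal NumPy sketch of the idea, not the paper's actual models, data, or training setup. It generates synthetic pulse snippets (a hypothetical shaped template with sub-sample timing jitter and noise), trains a tiny MLP regressor by plain gradient descent to predict the pulse amplitude, and compares its mean absolute error against a naive baseline that simply takes the largest sample. All shapes, the template, and the hyperparameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_snippets(n, n_samples=8):
    """Synthetic waveform snippets: amplitude * shaped template,
    with random sub-sample timing jitter and additive noise.
    The CR-RC-like template f(u) = u * exp(1 - u) peaks at u = 1
    with value 1, so the true peak amplitude equals `amp`."""
    t = np.arange(n_samples, dtype=float)
    amp = rng.uniform(0.5, 2.0, size=n)
    jitter = rng.uniform(-0.5, 0.5, size=n)        # timing offset in samples
    tau = 2.0
    u = np.maximum(t[None, :] - 2.0 - jitter[:, None], 0.0) / tau
    x = amp[:, None] * u * np.exp(1.0 - u)
    x += rng.normal(0.0, 0.02, size=x.shape)       # readout noise
    return x, amp

X, y = make_snippets(4000)       # training set
Xte, yte = make_snippets(1000)   # test set

# Tiny MLP: 8 inputs -> 16 ReLU units -> 1 output, full-batch GD on MSE.
W1 = rng.normal(0, 0.3, (8, 16)); b1 = np.zeros(16)
W2 = rng.normal(0, 0.3, (16, 1)); b2 = np.zeros(1)
lr = 0.05
for _ in range(500):
    h = np.maximum(X @ W1 + b1, 0.0)               # hidden activations
    pred = (h @ W2 + b2).ravel()
    g_pred = (2.0 / len(y)) * (pred - y)[:, None]  # dMSE/dpred
    gW2 = h.T @ g_pred; gb2 = g_pred.sum(0)
    g_h = (g_pred @ W2.T) * (h > 0)                # backprop through ReLU
    gW1 = X.T @ g_h; gb1 = g_h.sum(0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

h = np.maximum(Xte @ W1 + b1, 0.0)
mae_mlp = np.abs((h @ W2 + b2).ravel() - yte).mean()
mae_naive = np.abs(Xte.max(axis=1) - yte).mean()   # largest-sample baseline
print(f"MLP MAE: {mae_mlp:.3f}, naive MAE: {mae_naive:.3f}")
```

The naive baseline underestimates the amplitude whenever jitter places the true peak between sampling points, which is the kind of systematic error a learned model can correct. Note also the storage arithmetic: the 161 weights and biases here would occupy about 0.6 KB as 32-bit floats, and 8-bit quantization alone would shrink any such model roughly fourfold, consistent in spirit with the 16.6 KB to 1.5 KB reduction reported above.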