High-symbol-rate coherent optical transceivers are more severely impaired by the high-frequency responses of transceiver components, especially when a higher-order modulation format is applied. We recently proposed a neural network (NN)-based digital pre-distortion (DPD) technique trained to mitigate the transceiver response of a 128 GBaud coherent optical transmission system. In this paper, we further detail this work and assess the NN-based DPD by training it using either a direct learning architecture (DLA) or an indirect learning architecture (ILA), and compare its performance against a Volterra series-based ILA DPD and a linear DPD. Furthermore, we deliberately increase the transmitter nonlinearity and compare the performance of the three DPD schemes. The proposed NN-based DPD trained using DLA performs best among the three contenders. Compared with a linear DPD, it provides more than 1 dB of signal-to-noise ratio (SNR) gain at the output of a conventional coherent receiver digital signal processing (DSP) chain for uniform 64-quadrature amplitude modulation (QAM) and probabilistic constellation-shaped (PCS) 256-QAM signals. Finally, the NN-based DPD enables a record net rate of 1.61 Tb/s on a single channel after 80 km of standard single-mode fiber (SSMF).