Modern intra-data center (IDC) interconnects leverage robust and low-cost intensity modulation (IM) and direct detection (DD) optical links, based on multimode fibers (MMFs) and vertical-cavity surface-emitting lasers (VCSELs). Current solutions, based on on-off keying (OOK) modulation, reach 25-50 Gbps per lane over nearly 100 meters. The current target for IDCs is to increase the capacity of VCSEL-MMF links up to 100 Gbps, using four-level pulse amplitude modulation (PAM-4) on the same devices. An effective way to counteract the resulting linear and nonlinear distortions of the transmitted signal is to exploit digital signal processing (DSP). In this manuscript, we propose a novel method, based on end-to-end (E2E) learning, to optimize a nonlinear artificial neural network (ANN) digital pre-distorter (DPD). Trained jointly with a feed-forward equalizer (FFE), the DPD fulfills physical amplitude constraints and handles the different ratios between the sampling rates that occur along an optical IM-DD system. Specifically, we propose an E2E ANN system that operates simultaneously at different sampling frequencies. Moreover, our training method replaces the time-domain injection of receiver noise into the system with an additive regularization term in the FFE gradient loss. We experimentally show the advantages of the proposed DPD by comparing its bit error rate (BER) performance with that of the same scenario without DPD. We assess the gain in terms of gross bit rate and optical path loss (OPL), at given BER targets, for different fiber lengths.
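
One way to read the noise-substitution idea stated above: for a linear FFE and zero-mean white receiver noise that is independent of the signal, the expected mean squared error under noise injection equals the noise-free error plus a term proportional to the squared norm of the taps. The Python sketch below checks this identity numerically. It is an illustration under those assumptions, not the training procedure of this work: the tap values, the PAM-4 test sequence, the noise level `sigma`, and the helper names (`ffe`, `target`, `injected_noise_mse`, `regularized_mse`) are all hypothetical.

```python
import numpy as np

def ffe(w, x):
    """Linear FFE: FIR filter with taps w, keeping only fully overlapped outputs."""
    return np.convolve(x, w, mode="valid")

def target(x, n_taps):
    """Reference symbols aligned with the center tap of the FFE."""
    c = n_taps // 2
    start = n_taps - 1 - c
    return x[start:start + len(x) - n_taps + 1]

def injected_noise_mse(w, x, sigma, rng, trials=200):
    """Monte Carlo estimate of the MSE when white noise is added before the FFE."""
    d = target(x, len(w))
    acc = 0.0
    for _ in range(trials):
        n = rng.normal(0.0, sigma, size=x.shape)  # assumed white Gaussian receiver noise
        acc += np.mean((ffe(w, x + n) - d) ** 2)
    return acc / trials

def regularized_mse(w, x, sigma):
    """Noise-free MSE plus the additive term sigma^2 * ||w||^2 (no noise samples drawn)."""
    d = target(x, len(w))
    clean = np.mean((ffe(w, x) - d) ** 2)
    return clean + sigma ** 2 * np.sum(w ** 2)

rng = np.random.default_rng(0)
x = rng.choice(np.array([-3.0, -1.0, 1.0, 3.0]), size=4096)  # hypothetical PAM-4 symbols
w = np.array([0.02, -0.05, 0.10, 0.90, 0.10, -0.05, 0.02])   # hypothetical FFE taps
sigma = 0.3                                                  # hypothetical noise std

mc = injected_noise_mse(w, x, sigma, rng)
closed = regularized_mse(w, x, sigma)
print(f"Monte Carlo: {mc:.4f}  closed form: {closed:.4f}")
```

Under these assumptions the two values agree up to Monte Carlo error, so a regularizer of this kind avoids drawing noise samples at every training step. Whether the term proposed in this work takes exactly this form is not stated in the abstract.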