Artificial intelligence (AI)-based surrogate reservoir models (SRMs) can provide computationally feasible and accurate approximations to numerical simulations. An AI-based SRM is trained to a set of parameters that significantly reduces its variance, by either supervised or semi-supervised learning. The latter involves regularization of the model’s parameters using non-physics-based terms, physics-based terms, or a combination of both.

Effective enforcement of the physics-based and non-physics-based regularizations can significantly reduce the variance of AI-based SRMs, yet few studies have reported on the application and effects of regularization terms. Also, for highly compressible subsurface flow, where strong nonlinearities exist, well-constructed composite AI-based architectures and regularizations are necessary for learning.

This paper applies and studies the effects of various regularization terms for highly compressible subsurface flow, and it proposes unique and effective techniques for AI-based surrogate development and training. The learning utilizes the discretized domain and boundary physics, with derivatives obtained from both finite difference methods (FDM) and algorithmic differentiation (AD). The regularizations are enforced partly as a hard constraint in the network architecture, using a trainable layer, and partly as soft constraints, using a multi-cost function. The soft constraints exploit a tank material balance and time-discretization numerical errors, in addition to the domain, boundary and non-physics-based L2 regularization terms. The AI-based surrogate is trained in a timely manner, and its predictions agree with those obtained from a numerical simulator.

The regularization terms separately contribute to improved learning. The non-physics-based L2 norm, if weighted at the right order of magnitude, improves grid block predictions.
The tank material balance regularization term constrains the AI-based surrogate parameters to the net domain accumulation, ensuring the surrogate's reliability. The trainable hard enforcement layer enforces the initial condition and improves the predictions compared to other hard enforcement techniques. The discretized domain equation and time-discretization numerical errors allow for learning of variable timesteps, which give the best rounding-truncation error trade-off and improve the predictions compared to those of fixed timesteps. The AI-based surrogate, effectively trained by semi-supervision, can be reliably used as a state-dependent model in domain analyses such as sensitivity analysis and data assimilation.
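
The split between hard and soft enforcement described above can be illustrated with a minimal sketch. It is not the paper's implementation. The gate form `1 - exp(-w * t)`, the function names `hard_initial_condition` and `composite_loss`, the residual names, and all numerical values are assumptions chosen only for illustration. The gate is zero at `t = 0` for any value of the trainable scalar `w`, so the initial condition holds exactly, while the soft constraints enter as a weighted sum of mean-squared residuals plus an L2 penalty on the parameters.

```python
import numpy as np

def hard_initial_condition(raw_output, t, p_init, w):
    """Blend the raw network output so the prediction equals p_init at t = 0.

    Assumed gate form (hypothetical, not from the paper): 1 - exp(-w * t),
    where w stands in for a trainable scalar.
    """
    gate = 1.0 - np.exp(-w * t)
    return p_init + gate * raw_output

def composite_loss(residuals, weights, params, l2_weight):
    """Weighted sum of mean-squared soft-constraint residuals plus an L2 penalty."""
    soft = sum(weights[name] * np.mean(np.square(r))
               for name, r in residuals.items())
    l2 = l2_weight * sum(np.sum(np.square(p)) for p in params)
    return soft + l2

# Illustrative values only: three time levels and a raw network output.
t = np.array([0.0, 1.0, 2.0])
raw = np.array([5.0, -3.0, 2.0])
p = hard_initial_condition(raw, t, p_init=100.0, w=0.5)

# Hypothetical residuals for two of the soft constraints named in the text.
residuals = {"domain": np.array([1.0, -1.0]), "tank_balance": np.array([2.0])}
weights = {"domain": 1.0, "tank_balance": 0.5}
params = [np.array([1.0, 2.0])]
loss = composite_loss(residuals, weights, params, l2_weight=0.1)
```

In this sketch the relative sizes of `weights` and `l2_weight` play the role of the "right order of magnitude" noted for the L2 term: an L2 weight that is too large would dominate the physics-based residuals.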