ABSTRACT We discuss a neural network approach, which does not rely on dynamic programming techniques, to solve dynamic portfolio optimization problems subject to multiple investment constraints. The approach allows for objectives of a very general form, encompassing both time-consistent and time-inconsistent objectives, as well as objectives requiring multi-level optimization. The number of parameters of the neural network remains independent of the number of portfolio rebalancing events. Compared to reinforcement learning, this technique avoids the computation of high-dimensional conditional expectations. The approach remains practical when considering large numbers of underlying assets, long investment time horizons, or very frequent rebalancing events. We prove convergence of the numerical solution to the theoretical optimal solution of a large class of problems under fairly general conditions, and present ground-truth analyses for a number of popular formulations, including mean-variance, mean-semi-variance, and mean-conditional value-at-risk problems. Numerical experiments show that if the investment objective functional is separable in the sense of dynamic programming, the correct time-consistent optimal investment strategy is recovered; otherwise, we obtain the correct pre-commitment (time-inconsistent) investment strategy. The method is agnostic as to the underlying data-generating assumptions, and results are illustrated using (i) parametric models for underlying asset returns, (ii) stationary block bootstrap resampling of empirical returns, and (iii) generative adversarial network (GAN)-generated synthetic asset returns.
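As an illustration only, and not the implementation used in this work, the following minimal Python sketch shows one way a single network could represent the investment strategy at every rebalancing event. It assumes that the network takes (scaled time, current wealth) as input features, that a softmax output layer enforces a long-only, no-leverage constraint, and that a mean-variance objective with a hypothetical scalarization parameter `rho` is evaluated as a plain sample average over simulated return paths. All function names (`init_params`, `allocation`, `terminal_wealth`, `mean_variance_objective`, `n_params`) and the lognormal return generator are assumptions made for this example. Training is omitted; in practice the objective would be maximized over the parameters with a gradient-based optimizer.

```python
import numpy as np

def init_params(n_assets, hidden=8, seed=0):
    """Randomly initialise one small feed-forward network (hypothetical layout)."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(scale=0.5, size=(2, hidden)),   # inputs: (time, wealth)
        "b1": np.zeros(hidden),
        "W2": rng.normal(scale=0.5, size=(hidden, n_assets)),
        "b2": np.zeros(n_assets),
    }

def allocation(params, t_frac, wealth):
    """Map (scaled time, wealth) to long-only weights that sum to one."""
    x = np.stack([np.full_like(wealth, t_frac), wealth], axis=1)
    h = np.tanh(x @ params["W1"] + params["b1"])
    z = h @ params["W2"] + params["b2"]
    z = z - z.max(axis=1, keepdims=True)                 # numerically stable softmax
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def terminal_wealth(params, returns, w0=1.0):
    """returns: gross asset returns, shape (n_paths, n_rebalances, n_assets)."""
    n_paths, n_steps, _ = returns.shape
    wealth = np.full(n_paths, w0)
    for k in range(n_steps):
        # The same network is reused at every rebalancing event; only the
        # time feature changes, so no new parameters are introduced.
        p = allocation(params, k / n_steps, wealth)
        wealth = wealth * np.sum(p * returns[:, k, :], axis=1)
    return wealth

def mean_variance_objective(params, returns, rho=1.0):
    """Sample estimate of E[W_T] - rho * Var[W_T] over all simulated paths."""
    w_T = terminal_wealth(params, returns)
    return w_T.mean() - rho * w_T.var()

def n_params(params):
    """Total number of trainable parameters in the network."""
    return sum(v.size for v in params.values())

# Usage: the parameter count is unchanged for 4 or 40 rebalancing events.
rng = np.random.default_rng(1)
params = init_params(n_assets=2)
for n_steps in (4, 40):
    paths = np.exp(rng.normal(0.005, 0.04, size=(1000, n_steps, 2)))
    print(n_steps, n_params(params), mean_variance_objective(params, paths))
```

Because the objective is computed directly from terminal wealth along each simulated path, the sketch involves no backward recursion and no conditional expectations. The array `paths` could equally be filled by bootstrap resampling or by a generative model, since the strategy only consumes return paths.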