This article presents GridNetOpt, a fast full-chip electromigration (EM)-aware, IR-drop-constrained optimization framework for on-chip power grid networks that is accelerated by deep neural networks (DNNs). Compared with existing linear programming-based methods, the new method employs more flexible conjugate gradient-based optimization to size the wire widths of the power grid. To mitigate the high cost of computing sensitivities with the adjoint network method, which requires a full-chip IR drop analysis at every iteration step, the sensitivity is computed via a trained conditional generative adversarial network (CGAN), exploiting the differentiability of DNNs for fast sensitivity computation. The sensitivity, which is the derivative of node voltage with respect to wire resistance, guides the search direction during the optimization process. To account for EM failure effects more accurately, the training data are obtained from power grids under different wire widths and current loads, analyzed by a state-of-the-art full-chip multi-physics-based coupled EM-IR drop analysis tool. This is in contrast with existing linear programming-based methods, which can handle only immortal wires or wires with non-zero resistance. Numerical results on a number of synthesized power grid benchmarks from ARM Cortex-M0 processor designs show that GridNetOpt achieves at least an order of magnitude speedup over the conjugate gradient-based method using the traditional adjoint network method. Compared with the previous localized power grid fixing work based on GridNet, GridNetOpt incurs smaller area overhead on all the benchmarks we tested. It can also reduce IR drops for power grid circuits with immortal wires, which is not possible with the localized GridNet method.
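To make the notion of sensitivity concrete, the following is a minimal sketch, not the GridNetOpt implementation. It replaces both the CGAN and the full-chip solver with a closed-form model of a single power stripe fed from one pad, for which the derivative of the worst IR drop with respect to each wire width can be written by hand and checked against finite differences. The sheet resistance, segment length, widths, loads, and all function names are illustrative assumptions.

```python
# Hypothetical toy model: one power stripe fed from a pad at its left end.
# Segment j connects node j-1 (or the pad, for j = 0) to node j.
# This stands in for the DNN surrogate described in the text; it is NOT
# the authors' model, and all constants below are assumed values.

SHEET_RES = 0.02   # ohm per square (assumed)
SEG_LEN = 50.0     # segment length in um (assumed)


def segment_resistances(widths):
    """R_j = sheet resistance * length / width for each segment."""
    return [SHEET_RES * SEG_LEN / w for w in widths]


def segment_currents(loads):
    """Current in segment j is the sum of all loads at node j and beyond."""
    currents = []
    running = 0.0
    for load in reversed(loads):
        running += load
        currents.append(running)
    return currents[::-1]


def ir_drops(widths, loads):
    """IR drop at each node: cumulative sum of R_j * I_j starting at the pad."""
    drops = []
    total = 0.0
    for r, i in zip(segment_resistances(widths), segment_currents(loads)):
        total += r * i
        drops.append(total)
    return drops


def worst_drop_sensitivity(widths, loads):
    """d(worst drop)/d(width_j) = -I_j * SHEET_RES * SEG_LEN / width_j**2.

    With non-negative loads the worst drop in a chain is at the far-end
    node, and every segment lies on its path back to the pad.
    """
    return [-i * SHEET_RES * SEG_LEN / (w * w)
            for w, i in zip(widths, segment_currents(loads))]


def finite_difference(widths, loads, j, eps=1e-6):
    """Numerical check: bump width j and re-evaluate the far-end drop."""
    bumped = list(widths)
    bumped[j] += eps
    return (ir_drops(bumped, loads)[-1] - ir_drops(widths, loads)[-1]) / eps


widths = [2.0, 2.0, 1.0, 1.0]          # um (assumed)
loads = [0.010, 0.020, 0.015, 0.005]   # A (assumed)
grad = worst_drop_sensitivity(widths, loads)
```

Every entry of `grad` is negative, since widening any segment lowers the far-end drop. A gradient-based sizing loop would weigh these values against the area cost of each segment when choosing a search direction. In a full-chip setting the closed form is unavailable, which is the role the abstract assigns to the differentiable DNN.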