Learning-based methods have been widely applied to target localization. However, their performance has mostly been validated through simulation experiments and lacks theoretical analysis. In this paper, we investigate the performance of learning-based multi-target localization in a multiple-input multiple-output (MIMO) radar system with widely separated antennas. We derive a mean squared error bound (MSEB) for learning-based multi-target localization that depends on all the system parameters, including both radar-related and learning-network-related parameters. The MSEB can therefore be employed as a metric to optimize any parameter of interest, such as those associated with the learning network architecture, which is usually determined by trial and error through minimizing the training loss. Leveraging the MSEB, we can design the optimal learning network architecture that maximizes the testing performance, and this architecture can be obtained before training. Numerical results are presented to verify the correctness of the MSEB and the effectiveness of the MSEB-based learning network architecture design method.