In this paper, we propose a novel technique for the automatic design of Artificial Neural Networks (ANNs) by evolving to the optimal network configuration(s) within an architecture space. It is entirely based on a multi-dimensional Particle Swarm Optimization (MD PSO) technique, which re-forms the native structure of swarm particles in such a way that they can make inter-dimensional passes with a dedicated dimensional PSO process. Therefore, in a multidimensional search space where the optimum dimension is unknown, swarm particles can seek both positional and dimensional optima. This eventually removes the necessity of setting a fixed dimension a priori, which is a common drawback for the family of swarm optimizers. With the proper encoding of the network configurations and parameters into particles, MD PSO can then seek the positional optimum in the error space and the dimensional optimum in the architecture space. The optimum dimension converged at the end of a MD PSO process corresponds to a unique ANN configuration where the network parameters (connections, weights and biases) can then be resolved from the positional optimum reached on that dimension. In addition to this, the proposed technique generates a ranked list of network configurations, from the best to the worst. This is indeed a crucial piece of information, indicating what potential configurations can be alternatives to the best one, and which configurations should not be used at all for a particular problem. In this study, the architecture space is defined over feed-forward, fully-connected ANNs so as to use the conventional techniques such as back-propagation and some other evolutionary methods in this field. The proposed technique is applied over the most challenging synthetic problems to test its optimality on evolving networks and over the benchmark problems to test its generalization capability as well as to make comparative evaluations with the several competing techniques. The experimental results show that the MD PSO evolves to optimum or near-optimum networks in general and has a superior generalization capability. Furthermore, the MD PSO naturally favors a low-dimension solution when it exhibits a competitive performance with a high dimension counterpart and such a native tendency eventually yields the evolution process to the compact network configurations in the architecture space rather than the complex ones, as long as the optimality prevails.