The conventional wisdom of simple models in machine learning misses the bigger picture, especially over-parameterized neural networks (NNs), where the number of parameters are much larger than the number of training data. Our goal is to explore the mystery behind over-parameterized models from a theoretical side. In this talk, I will discuss the role of over-parameterization in neural networks, to theoretically understand why they can perform well. First, I will discuss the role of over-parameterization in neural networks from the perspective of models, to theoretically understand why they can genralize well. Second, the effects of over-parameterization in robustness, privacy are discussed. Third, I will talk about the over-parameterization from kernel methods to neural networks in a function space theory view. Besides, from classical statistical learning to sequential decision making, I will talk about the benefits of over-parameterization on how deep reinforcement learning works well for function approximation. Potential future directions on theory of over-parameterization ML will also be discussed.