Massive multiple-input multiple-output (mMIMO) is a critical component of upcoming 5G wireless deployments as an enabler of high-data-rate communications. mMIMO is effective when each antenna pair of the transmitter and receiver arrays experiences an independent channel. While increasing the number of antenna elements raises the achievable data rate, it also makes computing the channel state information (CSI) prohibitively expensive. In this article, we propose to use deep learning via a multi-layer perceptron architecture that exceeds the performance of traditional CSI estimation methods such as least squares (LS) and linear minimum mean square error (LMMSE) estimation, thus leading to a beyond-fifth-generation (B5G) networking paradigm wherein machine learning fully drives network optimization. By computing the CSI of all pairwise channels simultaneously, our deep learning approach scales to large antenna arrays, unlike traditional estimation methods. The key insight is to design the learning architecture so that it can be implemented on massively parallel hardware, such as GPUs or FPGAs. We validate our approach by simulating a base station with a 32-element array and a user equipment with a 4-element array operating in the millimeter-wave frequency band. Results reveal an improvement of up to five and two orders of magnitude in bit error rate (BER) with respect to the fastest LS estimation and the optimal LMMSE estimation, respectively, substantially improving end-to-end system performance. Our approach also provides higher spatial diversity in low signal-to-noise ratio (SNR) regions, achieving up to 4 dB gain in received signal power compared to LMMSE estimation.
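To make the idea concrete, the snippet below is a minimal sketch of an MLP-based CSI estimator in PyTorch. The 32x4 antenna dimensions come from the simulated setup above, but the network width, depth, training loop, and synthetic data are illustrative assumptions, not the exact architecture or dataset used in the article. It shows how a coarse LS-style channel estimate, flattened into real and imaginary parts, can be mapped to a refined CSI estimate for all antenna pairs in a single forward pass.

```python
# Minimal sketch of an MLP-based CSI refiner (assumed architecture, not the
# exact network from the article). Input: an LS-style channel estimate for a
# 32x4 antenna-pair grid, flattened into real and imaginary parts.
import torch
import torch.nn as nn

N_TX, N_RX = 32, 4                  # base-station and UE array sizes from the simulation
DIM = 2 * N_TX * N_RX               # real + imaginary parts of all pairwise channels

class CSIMLP(nn.Module):
    def __init__(self, hidden=512):  # hidden width is an illustrative choice
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(DIM, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, DIM),  # refined CSI, same flattened layout as the input
        )

    def forward(self, h_ls):
        return self.net(h_ls)

# Toy training loop on synthetic data: a noisy copy of the "true" channel
# stands in for the LS estimate obtained from pilot symbols.
model = CSIMLP()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for step in range(200):
    h_true = torch.randn(64, DIM)               # batch of flattened true channels
    h_ls = h_true + 0.1 * torch.randn(64, DIM)  # noisy LS-style estimate
    loss = loss_fn(model(h_ls), h_true)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Because the network estimates all pairwise channels in one batched forward pass, the computation reduces to a few dense matrix multiplications, which is what makes a GPU or FPGA implementation attractive as the array size grows.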