Obtaining accurate ocean sound speed fields (SSFs) across a three-dimensional (3D) geographic region is vital for many underwater acoustic tasks. However, measurements are scarce because underwater sensors are costly, and 3D SSFs are high-dimensional; together, these factors make the reconstruction problem highly ill-conditioned and call for advanced models and methods. Our recent work analyzed the reconstruction error and identified one promising direction: finding a representation model that is both concise and expressive. Following this path, we proposed a tensor neural network (TNN) model, which combines the conciseness of tensor models with the expressive power of deep learning. However, existing TNN-based approaches have two limitations: (1) they cannot capture long-range correlations within the SSF, and (2) they can only handle discrete-indexed tensor data. To overcome these limitations and make fuller use of deep tensor learning, we introduce attention schemes into the existing TNN framework without compromising its interpretability. Additionally, we employ Gaussian process models to evolve the original parameterized tensor model into a functional tensor model, enabling reconstruction over a continuous domain rather than a fixed discrete grid. Numerical results on real-life datasets demonstrate the superior performance of our approach compared to state-of-the-art methods.