We propose a kinematic wave-based Deep Convolutional Neural Network (Deep CNN) to estimate high-resolution traffic speed fields from sparse probe vehicle trajectories. We introduce two key approaches that incorporate principles of kinematic wave theory to improve the robustness of existing learning-based estimation methods. First, we propose an anisotropic traffic kernel for the Deep CNN. The anisotropic kernel explicitly accounts for space-time correlations in macroscopic traffic and reduces the number of trainable parameters in the Deep CNN model. Second, we propose training the Deep CNN on simulated data. Training on targeted simulated data provides an implicit way to impose desirable physical features of traffic on the learning model. In the experiments, we highlight the benefits of anisotropic kernels and evaluate the transferability of the trained model to real-world traffic using the Next Generation Simulation (NGSIM) and the German Highway Drone (HighD) datasets. The results demonstrate that anisotropic kernels significantly reduce model complexity and overfitting, and improve the physical correctness of the estimated speed fields. We find that model complexity scales linearly with problem size for anisotropic kernels, whereas it scales quadratically for isotropic kernels. Furthermore, evaluation on the real-world datasets shows acceptable performance, which establishes that simulation-based training is a viable surrogate for learning from real-world data. Finally, a comparison with standard estimation techniques shows the superior estimation accuracy of the proposed method.
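To make the idea of an anisotropic kernel concrete, the following is a minimal sketch of one way a convolution kernel could be restricted to the space-time cells that kinematic waves can connect to the cell being estimated. It is an illustration under stated assumptions, not the construction used in this work: the function name `anisotropic_mask`, the wave-speed bounds `v_free` and `w_back`, the grid sizes, and the half-cell tolerance are all hypothetical choices made for the example.

```python
import numpy as np


def anisotropic_mask(n_space, n_time, dx, dt, v_free, w_back):
    """Boolean mask over an (n_space x n_time) kernel centered on the target cell.

    Assumption (hypothetical, for illustration): information travels at wave
    speeds between -w_back (backward, congested) and +v_free (forward, free
    flow). A cell at offset (x, t) is kept only if some wave speed c in that
    range satisfies x = c * t, up to a half-cell tolerance.
    """
    # Offsets of each kernel cell from the center (x > 0 means downstream).
    xs = (np.arange(n_space) - n_space // 2) * dx
    ts = (np.arange(n_time) - n_time // 2) * dt
    X, T = np.meshgrid(xs, ts, indexing="ij")

    # Reachable spatial range at each time offset: between v_free*t and -w_back*t.
    lower = np.minimum(v_free * T, -w_back * T) - dx / 2
    upper = np.maximum(v_free * T, -w_back * T) + dx / 2
    return (X >= lower) & (X <= upper)


# Example values are assumptions: 10 m cells, 1 s steps, 30 m/s and 5 m/s waves.
mask = anisotropic_mask(n_space=7, n_time=7, dx=10.0, dt=1.0,
                        v_free=30.0, w_back=5.0)

# A full (isotropic) kernel would train every weight; the masked kernel trains
# only the weights where the mask is True.
weights = np.random.default_rng(0).normal(size=mask.shape)
masked_weights = np.where(mask, weights, 0.0)
print("trainable weights:", int(mask.sum()), "of", mask.size)
```

In a CNN, a mask of this kind would typically be multiplied into the kernel weights before each convolution so that the excluded weights stay at zero and receive no gradient. How the kernel shape is actually parameterized, and how it yields the linear scaling reported above, is left to the method description.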