This paper presents implementing and optimizing a high-order unstructured computational fluid dynamics (CFD) solver using five shared-memory programming models: CUDA, OpenACC, OpenMP, Kokkos, and OP2. The study aims to evaluate the performance of these models on different hardware architectures, including NVIDIA GPUs, x86-based Intel/AMD, and Arm-based systems. The goal is to determine whether these models can provide developers with performance-portable solvers running efficiently on various architectures. The paper forms a more holistic view of a high-order solver across multiple platforms by visualizing performance portability (PP) and measuring productivity. It gives guidelines for translating existing codebases and their data structures to these models.