Modeling plasma accelerators is a computationally challenging task, and the quasi-static particle-in-cell algorithm is a method of choice in a wide range of situations. In this work, we present HiPACE++, the first performance-portable, quasi-static, three-dimensional particle-in-cell code. By decomposing all the computation of a 3D domain into successive 2D transverse operations and choosing appropriate memory management, HiPACE++ demonstrates orders-of-magnitude speedups on modern scientific GPUs over CPU-only implementations. The 2D transverse operations are performed on a single GPU, avoiding time-consuming communications. The longitudinal parallelization is done through temporal domain decomposition, enabling near-optimal strong scaling from 1 to 512 GPUs. HiPACE++ is a modular, open-source code enabling efficient modeling of plasma accelerators from laptops to state-of-the-art supercomputers.

Program summary

Program Title: HiPACE++
CPC Library link to program files: https://doi.org/10.17632/zh3rc7hvrm.1
Developer's repository link: HiPACE++ GitHub repository
Licensing provisions: BSD 3-clause
Programming language: C++
Nature of problem: Modeling plasma accelerators is a computationally challenging task requiring nanometer-scale resolutions over meter-scale propagation distances. The quasi-static particle-in-cell method enables high-fidelity simulations of this strongly non-linear process, but these simulations can be very expensive.
Solution method: The quasi-static particle-in-cell algorithm is modified to enable efficient use of accelerated hardware, in particular GPUs, reducing the cost of simulations by orders of magnitude. A novel longitudinal parallelization enables excellent strong scaling of this method up to hundreds of GPUs.
Reference: 10.5281/zenodo.5358483
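
To illustrate the idea of decomposing a 3D computation into successive 2D transverse operations, the following is a minimal C++ sketch and not the HiPACE++ implementation. All names (Slice2D, transverse_update, sweep_domain) are hypothetical, and the per-slice update is a placeholder that stands in for the actual transverse work. The sketch only assumes that each slice depends on the previously computed one, so two 2D buffers suffice and the full 3D volume is never stored.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical 2D transverse slice, stored in row-major order.
struct Slice2D {
    std::size_t nx, ny;
    std::vector<double> data;
    Slice2D(std::size_t nx_, std::size_t ny_, double value = 0.0)
        : nx(nx_), ny(ny_), data(nx_ * ny_, value) {}
};

// Placeholder for the per-slice transverse work (an assumption made for
// illustration): every cell of the current slice is the matching cell of
// the previous slice plus a constant source term.
void transverse_update(const Slice2D& previous, Slice2D& current, double source) {
    for (std::size_t i = 0; i < current.data.size(); ++i) {
        current.data[i] = previous.data[i] + source;
    }
}

// Sweep through nz longitudinal slices while keeping only two 2D slices
// in memory. Returns the sum over the last computed slice as a simple
// diagnostic.
double sweep_domain(std::size_t nx, std::size_t ny, std::size_t nz, double source) {
    Slice2D previous(nx, ny);
    Slice2D current(nx, ny);
    for (std::size_t k = 0; k < nz; ++k) {
        transverse_update(previous, current, source);
        std::swap(previous, current);  // the new slice becomes the reference
    }
    double sum = 0.0;
    for (double v : previous.data) {
        sum += v;
    }
    return sum;
}
```

In this sketch the memory footprint scales with nx * ny rather than nx * ny * nz, which is the property that allows the transverse work for one slice to fit on a single device.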