Perfusion MRI based on arterial spin labeling (ASL) has an intrinsically low signal-to-noise ratio (SNR), so acquiring the signal at shorter echo times (TE) is needed to boost the SNR of ASL images. Spiral trajectories provide substantially shorter TE, which increases SNR, and are among the fastest k-space sampling schemes for encoding a given field of view and resolution. They also provide approximately isotropic point-spread functions and inherent refocusing of motion- and flow-induced phase errors. In ASL-MRI, however, these advantages have been counterbalanced by practical technical challenges that limit the efficiency of spiral acquisitions. Spiral acquisitions are highly sensitive to encoding deficiencies such as static off-resonance in the main magnetic field, which manifests as blurring artifacts in the image. Moreover, deviations of the gradient fields from their nominal waveforms, caused by hardware imperfections, critically limit the practical use of spiral trajectories. In this work, I present single- and multiple-shot spiral ASL images that are robust against these typical spiral encoding drawbacks. This robustness is achieved by reconstructing with a comprehensive signal model that incorporates static off-resonance maps, coil sensitivity maps, and the actual B0 and gradient field dynamics up to third order in space. The spiral ASL signal acquisition was concurrently monitored using a third-order dynamic field camera based on NMR field probes. ASL images at 3 mm and 2 mm in-plane resolution, reconstructed using the monitored field dynamics and the static off-resonance maps, exhibited strongly reduced blurring, aliasing artifacts, and distortion. Concurrent field monitoring also makes it possible to account for quasi-static B0 drifts and physiological field fluctuations, so that the parametric input data remain consistent with the encoding geometry.
In conclusion, concurrent field monitoring in spiral ASL acquisition largely overcomes the traditional vulnerability of spiral trajectories in practice, providing high-quality ASL images with increased SNR, speed, and motion robustness.
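To make the structure of such a signal model concrete, the following is a minimal sketch in Python of how an encoding matrix combining monitored field dynamics, static off-resonance, and coil sensitivities might be assembled. It is an illustration under stated assumptions, not the reconstruction used in this work: the function name `encoding_matrix`, the array layouts, and the sign convention of the phase are hypothetical, and the count of 16 basis functions simply reflects the number of real-valued spherical harmonics up to third order (1 + 3 + 5 + 7).

```python
import numpy as np


def encoding_matrix(k_coeffs, basis, delta_omega, t, sens):
    """Assemble a higher-order encoding matrix (illustrative sketch only).

    Assumed layouts (hypothetical, chosen for this example):
      k_coeffs    : (n_t, n_b)      monitored phase coefficients k_l(t)
      basis       : (n_b, n_vox)    spatial basis functions b_l(r) per voxel,
                                    e.g. n_b = 16 for third-order harmonics
      delta_omega : (n_vox,)        static off-resonance in rad/s
      t           : (n_t,)          sample times in s
      sens        : (n_coils, n_vox) complex coil sensitivities
    Returns E with shape (n_coils * n_t, n_vox), rows ordered coil-major.
    """
    # Phase from monitored field dynamics: sum over l of k_l(t) * b_l(r).
    dynamic_phase = k_coeffs @ basis                  # (n_t, n_vox)
    # Phase accrued from static off-resonance: delta_omega(r) * t.
    static_phase = np.outer(t, delta_omega)           # (n_t, n_vox)
    # Assumed sign convention: signal phase is exp(-i * total phase).
    enc = np.exp(-1j * (dynamic_phase + static_phase))
    # Weight each voxel by the coil sensitivity and stack the coils.
    E = sens[:, None, :] * enc[None, :, :]            # (n_coils, n_t, n_vox)
    return E.reshape(-1, basis.shape[1])
```

Given such a matrix `E` and the stacked coil data `s`, an image estimate could be obtained by solving the linear system `E @ rho = s` in the least-squares sense, for example iteratively with a conjugate-gradient method. The choice of solver and any regularization are left open here.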