The performance of different probabilistic amplitude shaping (PAS) techniques in the nonlinear regime is investigated, highlighting its dependence on the PAS block length and the interaction with carrier phase recovery (CPR). Different PAS implementations are considered, based on different distribution matching (DM) techniques (namely, sphere shaping, shell mapping with different numbers of shells, and constant composition DM) and amplitude-to-symbol maps. When CPR is not included, PAS with optimal block length provides a nonlinear shaping gain with respect to a linearly optimized PAS (with infinite block length); among the considered DM techniques, the largest gain is obtained with sphere shaping. On the other hand, the nonlinear shaping gain becomes smaller, or completely vanishes, when CPR is included, meaning that in this case all the considered implementations achieve a similar performance for a sufficiently long block length. Similar results are obtained in different link configurations (<inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">$1\times 180$</tex-math></inline-formula> km, <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">$15\times 80$</tex-math></inline-formula> km, and <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">$27\times 80$</tex-math></inline-formula> km single-mode-fiber links), and also when laser phase noise is included, except when in-line dispersion compensation is used. Furthermore, we define a new metric, the nonlinear phase noise (NPN) metric, which is based on the frequency-resolved logarithmic perturbation models and explains the interaction of CPR and PAS. We show that the NPN metric is highly correlated with the performance of the system.
Our results suggest that the optimization of PAS in the nonlinear regime should, in general, account for the presence of a CPR algorithm. In this case, the reduction of the rate loss (obtained by using sphere shaping and increasing the DM block length) turns out to be more important than the mitigation of the nonlinear phase noise (obtained by using constant-energy DMs and reducing the block length), the latter being already provided by the CPR algorithm.