The vision of deploying miniature vehicles within the human body for intricate tasks holds tremendous promise across engineering and medical domains. Here, optimal navigation of a cargo-towing swimmer under an applied zig-zag flow is studied using direct numerical simulations coupled with a deep reinforcement learning algorithm. The tasks are navigation in the flow direction and in the shear-gradient direction. We first explore combinations of state inputs and find that optimal navigation requires swimmers to perceive both hydrodynamic signals and their alignment, which outperforms reliance on hydrodynamic signals alone, while also taking their memory into account. Next, we study combinations of action spaces that allow dynamic changes in the swimming and/or rotational velocities by tuning the B1 and C1 parameters of the squirmer model, respectively. With both parameters held fixed, cargo-towing swimmers outperform swimmers without load in the flow direction, owing to tumbling movements induced by the shear flow. In the shear-gradient direction, swimmers without load outperform cargo-towing swimmers, and performance decreases as load length increases. Among the combinations in which B1 and C1 are allowed to change, policies with only B1 dynamic demonstrate superior navigation. The learned policies are then showcased against naive cargo-towing swimmers and inert colloidal chains. A t-distributed stochastic neighbor embedding analysis reveals the complex interplay between perceived hydrodynamic signals and swimmer position. In the flow direction, swimmers align effectively with regions of maximum velocity, while in the shear-gradient direction, periodic transitions from minimum to maximum state values occur. In a comparison of pullers, pushers, and neutral swimmers, cargo-towing swimmers show a reversal in swimming velocity trends, with pullers outpacing neutral and pusher swimmers irrespective of load length.

Published by the American Physical Society 2024