This paper proposes a new method for joint design of radiofrequency (RF) and gradient waveforms in Magnetic Resonance Imaging (MRI), and applies it to the design of 3D spatially tailored saturation and inversion pulses. The joint design of both waveforms is characterized by the ODE Bloch equations, to which there is no known direct solution. Existing approaches therefore typically rely on simplified problem formulations based on, e.g., the small-tip approximation or constraining the gradient waveforms to particular shapes, and often apply only to specific objective functions for a narrow set of design goals (e.g., ignoring hardware constraints). This paper develops and exploits an auto-differentiable Bloch simulator to directly compute Jacobians of the (Bloch-simulated) excitation pattern with respect to RF and gradient waveforms. This approach is compatible with arbitrary sub-differentiable loss functions, and optimizes the RF and gradients directly without restricting the waveform shapes. For computational efficiency, we derive and implement explicit Bloch simulator Jacobians (approximately halving computation time and memory usage). To enforce hardware limits (peak RF, gradient, and slew rate), we use a change of variables that makes the 3D pulse design problem effectively unconstrained; we then optimize the resulting problem directly using the proposed auto-differentiation framework. We demonstrate our approach with two kinds of 3D excitation pulses that cannot be easily designed with conventional approaches: Outer-volume saturation (90° flip angle), and inner-volume inversion.