A spatiotemporal phase modulator (STPM) is investigated theoretically using vectorial diffraction theory. The STPM is equivalent to a time-dependent, phase-only pupil filter that alternates between a homogeneous filter and a stripe-shaped filter with a sinusoidal phase distribution. It is found that two-photon focal modulation microscopy (TPFMM) using this STPM can significantly suppress the background contribution from out-of-focus ballistic excitation while achieving almost the same resolution as two-photon microscopy. The modulation depth is also evaluated, and a trade-off is found between the signal-to-background ratio and the signal-to-noise ratio. These theoretical results provide important insights into future implementations of TPFMM and its potential to further extend the penetration depth of nonlinear microscopy in imaging multiple-scattering biological tissues.
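As a rough illustration of the alternating pupil filter described above, the following toy sketch models the two filter states in one dimension and compares the on-axis two-photon signal between them. It is not the vectorial diffraction calculation used in this work: it assumes a scalar field, a uniformly illuminated one-dimensional pupil, and a two-photon signal proportional to the fourth power of the on-axis field magnitude. The function names, the square-wave switching in `state_at`, the phase amplitude of 2.405 rad, the stripe count, and the contrast-style definition of modulation depth are all hypothetical choices made for illustration.

```python
import cmath
import math


def state_at(t, mod_freq):
    """Filter state at time t, assuming (hypothetically) square-wave switching."""
    half_periods = math.floor(2.0 * mod_freq * t)
    return "homogeneous" if half_periods % 2 == 0 else "stripes"


def pupil_phase(x, state, amplitude=2.405, n_stripes=4):
    """Phase in radians at normalized pupil coordinate x in [-1, 1].

    The homogeneous state applies zero phase; the stripe state applies a
    sinusoidal phase with n_stripes full periods across the pupil.
    """
    if state == "homogeneous":
        return 0.0
    return amplitude * math.sin(math.pi * n_stripes * x)


def on_axis_field(state, samples=2000, **kwargs):
    """Scalar on-axis focal field: pupil average of exp(i*phase), midpoint rule."""
    total = 0j
    for k in range(samples):
        x = -1.0 + (2.0 * k + 1.0) / samples
        total += cmath.exp(1j * pupil_phase(x, state, **kwargs))
    return total / samples


def two_photon_signal(state, **kwargs):
    """Two-photon signal taken as intensity squared, i.e. |field|**4 (assumption)."""
    return abs(on_axis_field(state, **kwargs)) ** 4


def modulation_depth(**kwargs):
    """Contrast between the two states (illustrative definition, not the paper's)."""
    s_hom = two_photon_signal("homogeneous", **kwargs)
    s_str = two_photon_signal("stripes", **kwargs)
    return (s_hom - s_str) / (s_hom + s_str)


if __name__ == "__main__":
    for t in (0.0, 0.6):
        state = state_at(t, mod_freq=1.0)
        print(t, state, two_photon_signal(state))
    print("modulation depth:", modulation_depth())
```

In this toy model the pupil average of the sinusoidal phase factor equals the Bessel function J0 of the phase amplitude, so an amplitude near 2.405 rad (the first zero of J0) nearly cancels the on-axis field in the stripe state. Smaller amplitudes give a shallower modulation, which is one simple way to see why the modulation depth is a design parameter.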