Incoherent digital holography relaxes the coherence requirement on the light source, greatly expanding the application range of digital holography. In this paper, we design a multi-head attention single-pixel (MHASP) phase-shifting network for incoherent digital holography. Once trained, the network predicts three interferograms, with phase shifts of 0, 2π/3, and 4π/3, from one-dimensional input data alone. Applying the conventional three-step phase-shifting method to these interferograms eliminates the DC and twin-image terms, and a high-fidelity reconstruction is then obtained with the back-propagation algorithm. Experimental results indicate that, in addition to enabling high-precision reconstruction, the MHASP phase-shifting approach preserves 3D information through calibration of the back-propagation distance, even with a reduced volume of holographic data. Furthermore, because a network replaces the physical phase-shifting operation, the approach improves the utilization of object light energy. The approach circumvents the constraints imposed by area-array sensors and enables high-fidelity imaging with minimal data volume, thereby broadening the applications of incoherent digital holography in 3D imaging.
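The three-step phase-shifting combination mentioned above can be illustrated with a minimal sketch. It assumes each interferogram follows the common model I_k = D + U·exp(iθ_k) + U*·exp(-iθ_k), where D is the DC term, U the complex hologram, and U* the twin term. This model, the function names, and the sign convention are assumptions made for illustration and are not taken from the paper. For the equally spaced shifts 0, 2π/3, and 4π/3, weighting each frame by exp(-iθ_k) and averaging cancels both D and U*.

```python
import cmath
import math

# Phase shifts named in the abstract: 0, 2*pi/3, 4*pi/3.
PHASE_SHIFTS = (0.0, 2 * math.pi / 3, 4 * math.pi / 3)


def three_step_complex_hologram(i1, i2, i3):
    """Combine three phase-shifted interferograms into a complex hologram.

    Assumes the model I_k = D + U*exp(i*theta_k) + conj(U)*exp(-i*theta_k)
    with the fixed, equally spaced shifts in PHASE_SHIFTS. The inputs may be
    plain floats (one pixel) or array-like objects that support scalar
    multiplication and addition.
    """
    w1, w2, w3 = (cmath.exp(-1j * theta) for theta in PHASE_SHIFTS)
    # The weights sum to zero (removes D) and their squares sum to zero
    # (removes the twin term), leaving 3 * U before the division.
    return (i1 * w1 + i2 * w2 + i3 * w3) / 3


def simulate_interferogram(dc, u, theta):
    """Hypothetical forward model, used only to check the formula."""
    return dc + 2 * (u * cmath.exp(1j * theta)).real


# Single-pixel check with an arbitrary DC level and complex amplitude.
u_true = 0.3 - 0.7j
frames = [simulate_interferogram(5.0, u_true, t) for t in PHASE_SHIFTS]
u_rec = three_step_complex_hologram(*frames)
```

Here `u_rec` should agree with `u_true` up to floating-point error. In the pipeline the abstract describes, the three frames would come from the network's predictions rather than from a simulated forward model, and the recovered complex hologram would then be numerically back-propagated to the calibrated distance.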