We have developed a neural network-based pipeline that estimates the masses of galaxy clusters with known redshift directly from X-ray photon information. The networks were trained with supervised learning on simulations of eROSITA observations, focusing on the eROSITA Final Equatorial Depth Survey (eFEDS). We used convolutional neural networks modified to take additional information on the cluster as input, in particular its redshift. In contrast to existing works, we utilized simulations that include background and point sources, in order to develop a tool that is directly applicable to observational eROSITA data over an extended mass range, from group-size halos to massive clusters, with masses 10¹³ M⊙ < M < 10¹⁵ M⊙. Using this method, we provide, for the first time, neural network mass estimates for the observed eFEDS cluster sample from Spectrum-Roentgen-Gamma/eROSITA observations, and we find performance consistent with weak-lensing calibrated masses. This measurement did not use weak-lensing information; the only prior cluster mass information entered through the calibration of cluster properties in the simulations. On simulated data, we observe a reduced scatter compared with luminosity- and count-rate-based scaling relations. We also comment on the application to upcoming eROSITA All-Sky Survey observations.