ABSTRACT We developed convolutional neural networks (CNNs) to rapidly and directly infer the planet mass from radio dust continuum images. Substructures induced by young planets in protoplanetary discs can be used to infer the properties of those planets, and hydrodynamical simulations have been used to study the relationships between planet properties and these disc features. However, previous attempts either fine-tuned numerical simulations to fit one protoplanetary disc at a time, which is time-consuming, or azimuthally averaged the simulation results to derive linear relationships between the gap width/depth and the planet mass, which discards information on asymmetric features in discs. To overcome these limitations, we developed Planet Gap neural Networks (PGNets) to infer the planet mass from two-dimensional images. We first fit the gridded data in Zhang et al. as a classification problem. We then quadrupled the data set by running additional simulations with near-randomly sampled parameters, and derived the planet mass and disc viscosity together as a regression problem. The classification approach reaches an accuracy of 92 per cent, whereas the regression approach reaches 1σ uncertainties of 0.16 dex in planet mass and 0.23 dex in disc viscosity. We reproduce the degeneracy scaling $\alpha \propto M_\mathrm{p}^3$ found with the linear fitting method, which shows that the CNN method can also be used to identify degeneracy relationships. Gradient-weighted class activation mapping confirms that PGNets use the proper disc features to constrain the planet mass. We provide programs for PGNets and for the traditional fitting method from Zhang et al., and discuss the advantages and disadvantages of each method.
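For readers less used to uncertainties quoted in dex, the short sketch below converts the quoted 1σ values into multiplicative factors and shows what a cubic degeneracy $\alpha \propto M_\mathrm{p}^3$ implies in logarithmic units. It is only an illustrative calculation: the function names are hypothetical and are not part of the PGNets programs, and the only inputs taken from the text are the 0.16 dex and 0.23 dex values and the exponent 3.

```python
def dex_to_factor(sigma_dex: float) -> float:
    """Convert a scatter in dex (log10 units) into a multiplicative factor."""
    return 10.0 ** sigma_dex


def alpha_ratio_for_mass_ratio(mass_ratio: float, slope: float = 3.0) -> float:
    """Ratio of viscosity alpha implied by a planet mass ratio along alpha ∝ M_p**slope."""
    return mass_ratio ** slope


def mass_dex_to_alpha_dex(mass_dex: float, slope: float = 3.0) -> float:
    """Shift in log10(alpha) that accompanies a shift in log10(M_p) along the degeneracy."""
    return slope * mass_dex


if __name__ == "__main__":
    # Values quoted in the abstract (assumed here to be symmetric in log10 space).
    sigma_mass_dex = 0.16
    sigma_alpha_dex = 0.23

    print(f"0.16 dex in planet mass is a factor of {dex_to_factor(sigma_mass_dex):.2f}")
    print(f"0.23 dex in viscosity is a factor of {dex_to_factor(sigma_alpha_dex):.2f}")

    # Along alpha ∝ M_p^3, doubling the planet mass multiplies alpha by 2**3.
    print(f"Doubling M_p changes alpha by a factor of {alpha_ratio_for_mass_ratio(2.0):.0f}")

    # The same relation in log space: a mass shift in dex is tripled in alpha.
    print(f"A 0.16 dex mass shift maps to {mass_dex_to_alpha_dex(sigma_mass_dex):.2f} dex in alpha")
```

Under these assumptions, 0.16 dex corresponds to a factor of roughly 1.45 in planet mass and 0.23 dex to a factor of roughly 1.70 in viscosity, while moving along the degeneracy by the mass uncertainty alone would shift $\alpha$ by about 0.48 dex.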