Traditional numerical discretization-based solvers of partial differential equations (PDEs) are fundamentally agnostic to domains, boundary conditions and coefficients. In contrast, machine-learnt solvers have limited generalizability across these elements of boundary value problems. This is especially true of surrogate models, which are typically trained on direct numerical simulations of PDEs applied to one specific boundary value problem. In a departure from this direct approach, label-free machine learning of solvers is centered on a loss function that does not use computed field solutions as labels. Instead, the PDE and boundary conditions are incorporated directly, in residual form, to express the loss function during training. However, the generalization of such solvers across boundary conditions remains limited, and they are strongly domain-dependent. Here, we present a framework that generalizes across domains, boundary conditions and coefficients while simultaneously learning the PDE in weak form. Our work explores the ability of convolutional neural network (CNN)-based encoder–decoder architectures to learn to solve a PDE in greater generality than its restriction to a particular boundary value problem. In this first Communication, we take the canonical path through elliptic PDEs and focus on steady-state diffusion and on linear and nonlinear elasticity. Importantly, the learning happens independently of any labeled field data from either experiments or direct numerical solutions. We develop probabilistic CNNs in the Bayesian setting using variational inference. Extensive results for these problem classes demonstrate the framework's ability to learn PDE solvers that generalize across hundreds of thousands of domains, boundary conditions and coefficients, including extrapolation beyond the learning regime. Once trained, the machine-learnt solvers are orders of magnitude faster than discretization-based solvers.
They are therefore relevant to the high-throughput solution of PDEs on varied domains, boundary conditions and coefficients, such as for inverse modeling, optimization, design and decision-making. We place our work in the context of other recent machine learning solvers of PDEs, including continuous operator learning frameworks, and make performance comparisons where possible. Finally, we note extensions to transfer learning, active learning and reinforcement learning.