The current gold standard for solving image processing and computer vision tasks is supervised learning of deep neural networks (DNNs), which requires large-scale datasets of input-output pairs. In many scenarios in which the output is an image (e.g., medical image analysis, image denoising, deblurring, super-resolution, dehazing, segmentation and optical flow estimation), the collection of labelled image pairs for training is either time-consuming or limited to simple degradation models. Indeed, there is a growing body of work on weakly supervised training, accompanied by a variety of unsupervised loss functions. This work dives into the regime of Deep-Energy, a task-driven training approach that substitutes the generic supervised loss with the minimization of an energy function by a DNN. Such energy functions, often formulated as a combination of a data-fidelity term and an application-specific prior, are essentially unsupervised, as they do not assume knowledge of the ground-truth output image. As opposed to classic energy minimization, where computationally intensive iterative inference is performed for each new image, our network, once trained, computes the output with a single forward pass. By incorporating application-specific domain knowledge into the loss function, we are able to train on real-world images, thus decreasing the dependency on pixel-wise labelled data or synthetic datasets. We demonstrate our approach on three different applications: seeded segmentation, image matting and single image dehazing, showing clear benefits in both speed and accuracy over the classical energy minimization approach, and competitive performance with respect to fully supervised alternatives.
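To make the contrast with classic energy minimization concrete, the following toy sketch (illustrative names and a 1-D denoising energy, not the paper's actual formulations) minimizes an energy of the form data-fidelity plus smoothness prior by per-image gradient descent. This is the iterative inference that Deep-Energy avoids: rather than descending the energy for every new input, one would train a network on this same energy as a loss, so that a single forward pass yields the output.

```python
import numpy as np

def energy(y, x, lam):
    # Data-fidelity term plus a quadratic smoothness prior; a stand-in
    # for an application-specific prior, chosen only for illustration.
    fidelity = np.sum((y - x) ** 2)
    prior = lam * np.sum(np.diff(y) ** 2)
    return fidelity + prior

def energy_grad(y, x, lam):
    # Analytic gradient of the energy above with respect to y.
    g = 2.0 * (y - x)
    d = np.diff(y)
    g[:-1] -= 2.0 * lam * d
    g[1:] += 2.0 * lam * d
    return g

def classic_minimize(x, lam=2.0, steps=200, lr=0.05):
    # Classic per-image inference: iterative gradient descent on the
    # energy, repeated from scratch for every new input image.
    y = x.copy()
    for _ in range(steps):
        y -= lr * energy_grad(y, x, lam)
    return y

rng = np.random.default_rng(0)
clean = np.sin(np.linspace(0.0, 2.0 * np.pi, 64))
noisy = clean + 0.3 * rng.standard_normal(64)
denoised = classic_minimize(noisy)
```

In the Deep-Energy setting, the same `energy` expression would instead be averaged over a training set and used as an unsupervised loss for a network's output, shifting the cost of minimization from inference time to training time.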