Biological soil crusts (biocrusts) form a layer of only one to few centimeters depth on the soil surface and occur mostly in hot and cold deserts. Biocrusts have a major impact on different processes in these ecosystems, like carbon and nitrogen cycling, biodiversity preservation, erosion protection and soil dust emission reduction, but also react highly sensitive upon climate alterations and land use intensification. Therefore, monitoring tools are required to keep track of the changes of these specialized communities in an altering environment. In the current study, we applied a semantic image segmentation approach, using neural networks. One main problem to be solved was, that the training data and target data, on which the model is applied, are often recorded with different camera devices. This leads to different statistical properties of the image data, like different scale, resolution, brightness etc., which could significantly affect the model's performance. To solve this problem, we propose a new domain adaption method using a joint energy-based approach. To test a semantic segmentation approach in general, we utilized biocrust imagery taken in Utah (United States of America) and two sub datasets from the National Park Gesäuse (Austria). Here, we achieved highly reliable results with an overall classification accuracy of 85.9% for the USA data and 88.6% and 91.4%, respectively, for the two sub datasets of the National Park Gesäuse. To test our joint energy-based domain adaption approach, we used the two sub datasets from the National Park Gesäuse, which were recorded with different camera devices. With this newly established approach, we improved the accuracy of our segmentation on the unlabeled sub dataset from 70.4% to 75.3%. The results suggest that joint energy-based modelling is a well-suited domain adaption method for semantic segmentation that could be applied to face various deep learning and image-based biomonitoring challenges.