Oblique photography is a regional digital surface model generation technique that can be widely used for building 3D model construction. However, due to the lack of geometric and semantic information about the building, these models make it difficult to differentiate more detailed components in the building, such as roofs and balconies. This paper proposes a deep learning-based method (U-NET) for constructing 3D models of low-rise buildings that address the issues. The method ensures complete geometric and semantic information and conforms to the LOD2 level. First, digital orthophotos are used to perform building extraction based on U-NET, and then a contour optimization method based on the main direction of the building and the center of gravity of the contour is used to obtain the regular building contour. Second, the pure building point cloud model representing a single building is extracted from the whole point cloud scene based on the acquired building contour. Finally, the multi-decision RANSAC algorithm is used to segment the building detail point cloud and construct a triangular mesh of building components, followed by a triangular mesh fusion and splicing method to achieve monolithic building components. The paper presents experimental evidence that the building contour extraction algorithm can achieve a 90.3% success rate and that the resulting single building 3D model contains LOD2 building components, which contain detailed geometric and semantic information.