Because high-canopy-density forests have a complex structure, traditional individual tree segmentation (ITS) algorithms based on airborne laser scanning (ALS) point clouds, which rely on manually set segmentation thresholds, struggle to cover the variety of situations encountered, and their ITS accuracy is unsatisfactory. This paper adopts a top-down segmentation strategy and proposes an adaptive segmentation method based on extreme offset deep learning, together with an ITS aggregation strategy based on a gradient change criterion that corrects the over-segmentation caused by random offsets, so that precise ITS is achieved. First, the segmentation sub-plot size is set to 25 m × 25 m, the regional point cloud and its treetops are labeled, and the offset network is trained. Second, the extreme offset network spatially transforms the original point cloud: each point is shifted to a position near the treetop, which yields an offset point cloud with high density at the treetops and enhances the discrimination among individual trees. Third, a self-adaptive mean shift algorithm based on the average neighboring distance clusters and labels the offset point cloud. Fourth, the clustered offset point cloud is mapped back to the original space to complete the preliminary segmentation. Finally, an ITS aggregation method merges adjacent canopies between which the gradient change is gentle. To investigate the universality of the proposed method across different stand structures, two coniferous forest plots (U1, U2) in the Blue Ridge area of Washington, USA, and two mixed forest plots (G1, G2) in Bretten, Germany, are selected for the experiments. The learning rate of the deep network is set to 0.001, the number of sampled points per sub-plot is 900, the transformer dimension is 512 × 512, the number of neighbors in the neighbor search is 16, and the number of up-sampling blocks is 3.
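The clustering step (mean shift with a bandwidth derived from the average neighboring distance) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state how the bandwidth is computed, so the rule used here (a hypothetical `scale` factor times the mean k-nearest-neighbor distance), the flat kernel, and the function names `average_neighbor_distance` and `adaptive_mean_shift` are all assumptions made for illustration.

```python
import numpy as np

def average_neighbor_distance(points, k=16):
    """Mean distance from each point to its k nearest neighbours (brute force)."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    k = min(k, len(points) - 1)
    # Column 0 of each sorted row is the point itself (distance 0), so skip it.
    nearest = np.sort(dist, axis=1)[:, 1:k + 1]
    return float(nearest.mean())

def adaptive_mean_shift(points, k=16, scale=2.0, n_iter=50, tol=1e-6):
    """Flat-kernel mean shift; the bandwidth is derived from the data.

    ASSUMPTION: bandwidth = scale * average k-NN distance. The paper's
    actual rule may differ.
    """
    points = np.asarray(points, dtype=float)
    bandwidth = scale * average_neighbor_distance(points, k)
    modes = points.copy()
    for _ in range(n_iter):
        diff = modes[:, None, :] - points[None, :, :]
        # weights[i, j] is 1 when point j lies within the bandwidth of mode i.
        weights = ((diff ** 2).sum(axis=-1) <= bandwidth ** 2).astype(float)
        new_modes = weights @ points / weights.sum(axis=1, keepdims=True)
        shift = np.abs(new_modes - modes).max()
        modes = new_modes
        if shift < tol:
            break
    # Points whose modes ended up closer than one bandwidth share a label.
    labels = np.full(len(points), -1, dtype=int)
    centers = []
    for i, mode in enumerate(modes):
        for j, center in enumerate(centers):
            if np.linalg.norm(mode - center) < bandwidth:
                labels[i] = j
                break
        else:
            centers.append(mode)
            labels[i] = len(centers) - 1
    return labels, bandwidth
```

Because the bandwidth follows the local point spacing, no threshold has to be chosen by hand for each plot, which is the motivation given above for an adaptive method. The brute-force distance matrix is only practical for small sub-plots, such as the 900 sampled points mentioned above.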
Experimental results show that in the structurally complex mixed forests (G1, G2), the F-score of the proposed method reaches 0.89, about 4% and 7% higher than that of the classical SHDR and the improved DK, respectively. In the high-canopy-density plots (U2, G2), the F-score of the proposed method also reaches 0.89, about 3% and 4% higher than that of SHDR and the improved DK, respectively. These results indicate that the proposed method achieves high universality and accuracy, even in complex stands with high canopy density.
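For readers unfamiliar with the metric, the following is a minimal sketch of how an F-score is commonly computed in tree detection studies, as the harmonic mean of recall and precision over matched trees. The abstract does not define its F-score or its tree-matching rule, so this definition is an assumption, and the function name `f_score` and the counts in the usage note are hypothetical.

```python
def f_score(true_pos, false_pos, false_neg):
    """Harmonic mean of recall and precision for detected trees.

    ASSUMPTION: true_pos = correctly segmented trees, false_pos = spurious
    segments (over-segmentation), false_neg = missed trees
    (under-segmentation).
    """
    detected = true_pos + false_pos
    reference = true_pos + false_neg
    precision = true_pos / detected if detected else 0.0
    recall = true_pos / reference if reference else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)
```

As an illustration with made-up counts, `f_score(90, 10, 20)` gives a precision of 0.9, a recall of about 0.82, and an F-score of about 0.86.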