Realistic reconstruction of real-world trees is a challenging task in computer graphics because natural trees have complex branch and leaf structures. Existing terrestrial laser scanning (TLS) systems can capture dense and precise tree point clouds, but they are expensive and not easy to carry around. A low-cost, portable alternative is to reconstruct the tree point cloud from multi-view images. However, a complete tree point cloud is usually difficult to obtain this way, because branches and leaves have similar textures and because there are often too few images. We therefore propose a new approach for reconstructing tree point clouds and geometries from sparse images. We first infer the camera parameters of each image and compute the bounding volume of the tree from these parameters. Next, we set the mask of each image and the voxel resolution, and project each voxel in 3D space onto all mask images to determine whether the voxel is valid. To reduce the erroneous deletion of valid voxels, we use a boundary threshold and adjust the mask resolution, which makes the point cloud reconstruction more robust. Finally, we propose an efficient tree reconstruction method that generates plausible tree geometries. We tested six tree species, including both deciduous and evergreen trees, and the results show that our approach can generate complete tree point clouds and realistic tree models even from a small number of images.
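The voxel-validity test outlined above can be illustrated with a minimal silhouette-carving sketch. It is not the authors' implementation. It assumes that each camera is given as a 3x4 projection matrix, that each mask is a binary foreground image, and that the boundary threshold acts as a pixel tolerance around the silhouette; this last point is one possible reading of the text, not a confirmed detail. The names `dilate_mask`, `carve_voxels`, and `boundary_tol` are hypothetical.

```python
import numpy as np


def dilate_mask(mask, radius):
    """Grow the foreground of a binary mask by `radius` pixels (square window)."""
    mask = mask.astype(bool)
    if radius <= 0:
        return mask
    h, w = mask.shape
    padded = np.pad(mask, radius)  # pads with False (background)
    out = np.zeros((h, w), dtype=bool)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out |= padded[dy:dy + h, dx:dx + w]
    return out


def carve_voxels(centers, projections, masks, boundary_tol=0):
    """Return a boolean array marking voxels that fall on the foreground in every view.

    centers:      (N, 3) voxel centres in world coordinates
    projections:  list of (3, 4) camera projection matrices (assumed input format)
    masks:        list of (H, W) binary foreground masks, one per camera
    boundary_tol: assumed pixel tolerance around the silhouette (hypothetical parameter)
    """
    n = len(centers)
    homog = np.hstack([centers, np.ones((n, 1))])  # homogeneous coordinates, (N, 4)
    keep = np.ones(n, dtype=bool)
    for P, mask in zip(projections, masks):
        h, w = mask.shape
        uvw = homog @ P.T  # projected homogeneous pixel coordinates, (N, 3)
        in_front = uvw[:, 2] > 1e-9
        depth = np.where(in_front, uvw[:, 2], 1.0)  # avoid division by zero
        u = np.rint(uvw[:, 0] / depth).astype(int)  # pixel column
        v = np.rint(uvw[:, 1] / depth).astype(int)  # pixel row
        inside = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        grown = dilate_mask(mask, boundary_tol)
        # Clip only to keep the indexing safe; out-of-image voxels are rejected by `inside`.
        on_foreground = grown[np.clip(v, 0, h - 1), np.clip(u, 0, w - 1)]
        keep &= inside & on_foreground
    return keep
```

In this sketch a voxel that projects outside any image is discarded for that view, which is a simplifying assumption. A larger `boundary_tol` keeps more voxels near the silhouette edge, trading a looser hull for fewer wrongly deleted voxels.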