This article studies the construction of a recognition model for transparent (glass) objects based on computer vision techniques and artificial intelligence models. Stereo matching is used to build a raw depth image from a stereo camera. The goal is to reconstruct that depth image, recover the depth information, and generate a complete depth image so that the position of transparent objects in the real scene can be identified reliably. The research also includes the design of a software interface for viewing depth images and point clouds and for controlling a robotic arm that grasps objects in three-dimensional space. The following results were obtained. The depth image reconstruction model improves on the ClearGrasp model when evaluated on the ClearGrasp datasets, and directions were identified for improving the depth reconstruction models and algorithms in a more quantitative manner. The success rate of picking up a glass cup is over 90% when the object is on the floor and over 70% when objects are placed at different heights. The software interface displays detailed information, including depth images, point clouds, and position graphs (x, y, z), and supports communication with the system. It is also easy to interact with and convenient to use during experiments.
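As an illustration of how a raw depth image and a point cloud can be derived from stereo matching output, the following is a minimal sketch and not the authors' implementation. It assumes a rectified stereo pair, a pinhole camera model, and the standard relation Z = f·B/d between depth, focal length, baseline, and disparity. The function names and the parameters `focal_px`, `baseline_m`, `fx`, `fy`, `cx`, and `cy` are hypothetical and do not come from the article.

```python
import numpy as np


def disparity_to_depth(disparity, focal_px, baseline_m):
    """Convert a disparity map (pixels) to a depth map (metres).

    Assumes a rectified stereo pair, so that Z = f * B / d.
    Pixels with no valid match (disparity <= 0) are left at depth 0;
    these are the holes a depth reconstruction model would need to fill.
    """
    disparity = np.asarray(disparity, dtype=np.float64)
    depth = np.zeros_like(disparity)
    valid = disparity > 0
    depth[valid] = focal_px * baseline_m / disparity[valid]
    return depth


def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth map into an (N, 3) point cloud.

    Uses the pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    Pixels with depth 0 are treated as invalid and skipped.
    """
    depth = np.asarray(depth, dtype=np.float64)
    v, u = np.indices(depth.shape)  # v: row index, u: column index
    valid = depth > 0
    z = depth[valid]
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    return np.stack([x, y, z], axis=1)
```

In this sketch, invalid pixels are kept as zeros rather than interpolated, so the output corresponds to a raw depth image; completing it is the task the reconstruction model addresses.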