Augmented Reality (AR)-based video telephony can give mobile users a better user experience (UX) because it lets participants place augmented objects on video frames and transmit them to a peer. Although quite a few AR-based mobile video communication models exist today, they offer limited support for technical functions such as real-time object detection, dynamic data selection, and discrimination between local and remote data augmentation. This paper presents an enhanced AR-based mobile video telephony scheme in which an object of interest is dynamically combined with a video frame through real-time object detection, so that users can immediately share their experience with a friend during a video call. To evaluate the effectiveness and feasibility of the proposed scheme, an application was implemented on a mobile system and its computational time was measured. Experimental results show that the proposed system can give users a better UX with only a small increase in computational time.