This study proposes a novel framework for estimating a user's gaze point. If a system can detect what a user is looking at, it can provide appropriate services accordingly. Most existing image-based gaze estimation methods fall into two types: those using a third-person viewpoint and those using a first-person viewpoint. However, the former yields inaccurate estimates, and the latter can raise privacy issues. The proposed framework replaces the first-person camera with readings from the acceleration and gyro sensors built into mobile devices. From the images captured by the third-person camera, machine learning techniques estimate a heatmap indicating how likely each object is to be the one the user is looking at. This heatmap is combined with the position of the user's head, which is obtained from the sensor readings, to estimate the location of the user's gaze point. Experimental results show that the proposed method achieves substantially higher accuracy than existing techniques. Because user gaze information is very helpful in providing advanced location-based services (LBSs), the proposed framework can increase the added value of various types of LBSs.
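
The combination step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how the heatmap and the head information are fused, so the multiplicative fusion, the Gaussian weighting, and the names `head_prior`, `estimate_gaze_point`, and `sigma` are all assumptions made for illustration. The sketch also assumes the sensor-derived head information has already been converted into a pixel location in the third-person image.

```python
import numpy as np


def head_prior(shape, center_xy, sigma):
    """Gaussian weighting centred on the pixel the head is assumed to face.

    Hypothetical helper: the source does not specify this form.
    """
    height, width = shape
    ys, xs = np.mgrid[0:height, 0:width]
    cx, cy = center_xy
    dist_sq = (xs - cx) ** 2 + (ys - cy) ** 2
    return np.exp(-dist_sq / (2.0 * sigma ** 2))


def estimate_gaze_point(object_heatmap, head_center_xy, sigma=20.0):
    """Fuse an object heatmap with a head-based prior and return the peak.

    Assumption: fusion is an element-wise product followed by an argmax.
    Returns the (x, y) pixel of the peak and the normalised fused map.
    """
    prior = head_prior(object_heatmap.shape, head_center_xy, sigma)
    fused = object_heatmap * prior
    total = fused.sum()
    if total > 0:
        fused = fused / total  # normalise so the map sums to 1
    row, col = np.unravel_index(np.argmax(fused), fused.shape)
    return (int(col), int(row)), fused


# Toy example: two candidate objects in a 50 x 60 image.
heatmap = np.zeros((50, 60))
heatmap[10, 10] = 1.0  # stronger object response, far from the head direction
heatmap[30, 40] = 0.8  # weaker object response, close to the head direction

gaze_xy, fused_map = estimate_gaze_point(heatmap, head_center_xy=(38, 28), sigma=5.0)
print(gaze_xy)
```

In this toy case the weaker object near the assumed head direction is selected over the stronger but distant one, which is the kind of disambiguation the sensor information is meant to provide.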