The cross-ratio (CR)-based method exploits the invariance of CRs under projective transformation to determine the screen point corresponding to the pupil center. However, this point is essentially the intersection of the eyeball optical axis (OA) with the screen, not the actual point-of-regard (POR). In addition, CR calculation presupposes that the corneal reflection points of four on-screen light sources are coplanar with the 3D pupil center, whereas this coplanarity is only assumed and does not hold exactly. To address these issues, this paper proposes an improved CR-based gaze estimation method using weighted averaging and polynomial compensation. With a single camera and two light sources, the 3D corneal center and the normal vector of the virtual pupil plane are first estimated using the eyeball imaging model, and four reference planes parallel to the virtual pupil plane are then determined based on the geometric model of the pupil and the screen corner points. For each reference plane, the conventional CR-based method is used to calculate the screen point corresponding to the intersection of that plane with the line connecting the camera optical center and the imaged pupil center. The point where the OA intersects the screen is then obtained as the weighted average of these four screen points. Finally, a polynomial is learned to map this point to the POR. Experimental results show that the gaze accuracy reaches 1.33° when the more accurate eye is selected and improves by a further 24% when the joint POR is calculated, which is competitive with state-of-the-art methods that use more complex systems. While simplifying the system configuration of CR-based methods, the proposed method avoids the non-coplanarity of the 3D pupil center and the corneal reflection plane and improves gaze estimation performance.
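The projective invariance that CR-based methods rely on can be checked numerically. The following is a minimal sketch, not the paper's implementation: the function names `cross_ratio` and `apply_homography`, the four sample points, and the matrix `H` are all illustrative assumptions. It computes the cross-ratio of four collinear points, maps them through an arbitrary homography (standing in for the screen-to-image projection), and confirms that the cross-ratio is unchanged.

```python
def cross_ratio(a, b, c, d):
    """Cross-ratio (a, b; c, d) of four collinear 2D points.

    Each point is reduced to a signed coordinate along the line through
    a and d; the common scale factor cancels in the ratio.
    """
    ux, uy = d[0] - a[0], d[1] - a[1]  # direction of the line

    def signed_pos(p):
        return (p[0] - a[0]) * ux + (p[1] - a[1]) * uy

    ta, tb, tc, td = (signed_pos(p) for p in (a, b, c, d))
    return ((tc - ta) * (td - tb)) / ((tc - tb) * (td - ta))


def apply_homography(H, p):
    """Map a 2D point through a 3x3 homography given as nested lists."""
    x, y = p
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (
        (H[0][0] * x + H[0][1] * y + H[0][2]) / w,
        (H[1][0] * x + H[1][1] * y + H[1][2]) / w,
    )


# Hypothetical collinear points, e.g. along one edge of a screen.
screen_pts = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (4.0, 4.0)]

# Arbitrary homography chosen for illustration only (assumption).
H = [
    [1.2, 0.1, 5.0],
    [-0.2, 0.9, 3.0],
    [0.001, 0.002, 1.0],
]

image_pts = [apply_homography(H, p) for p in screen_pts]

cr_screen = cross_ratio(*screen_pts)  # 1.5 for these sample points
cr_image = cross_ratio(*image_pts)    # equal to cr_screen up to rounding

print(cr_screen, cr_image)
```

Because the two values agree, a cross-ratio measured in the camera image can be transferred to the screen plane to locate the corresponding screen point, which is the step the conventional CR-based method performs for each reference plane.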