Text entry is extremely difficult, or sometimes impossible, for individuals with motor impairments (physical impairments and disabilities, whether congenital or due to injury) and in scenarios of situationally-induced impairments and disabilities. As a remedy, many rely on gaze typing with dwell-based selection, as it allows hands-free text entry. However, dwell-based gaze typing suffers from usability issues: reduced typing speed, a high error rate, a steep learning curve, and visual fatigue with prolonged use. Addressing these issues is crucial for improving the usability and performance of gaze typing.

In our work, we present a dwell-free, multimodal approach to gaze typing in which gaze input is supplemented with a foot input modality. Our combined gaze- and foot-based typing system comprises an enhanced virtual QWERTY keyboard (VKB) and a wearable device, augmenting the user's footwear, that provides the foot input. In this multimodal setup, the user points her gaze at the desired character and selects it with a foot input. We further investigated two approaches to foot-based selection, foot gesture-based selection and foot press-based selection, and compared them against standard dwell-based selection.

We evaluated our gaze typing system through a comparative study comprising three experiments (51 participants, 17 per experiment), each using one of the three target selection methods: participants in the first experiment used dwell-based selection, in the second foot gesture-based selection, and in the third foot press-based selection. We found that with dwell-based selection, the highest mean typing speed of 11.65 WPM (max 14.83 WPM) was achieved at a dwell time of 400 ms. Among the foot-based selection methods, the highest mean typing speed of 14.98 WPM (max 18.18 WPM) was achieved with foot press-based selection.
Furthermore, ANOVA tests revealed that the differences in typing speed between the three selection methods are significant, whereas no significant difference was found in the error rate.

Overall, based on both typing performance and qualitative feedback, the results suggest that gaze- and foot-based typing is convenient, easy to learn, and addresses the usability issues associated with dwell-based typing. Of the four foot gestures used in the study (toe tapping, heel tapping, right flick, and left flick), toe tapping was the most preferred. We also found that with foot-based selection, users quickly develop a rhythm of focusing on a character with the gaze and selecting it with the foot, and this familiarity reduces errors significantly. We believe our findings will encourage further research into leveraging supplemental foot input for gaze typing and, more generally, will assist in the development of rich foot-based interactions.