Hand gesture recognition is a noncontact and intuitive communication approach that allows for natural and efficient interaction. This work focuses on developing a novel and robust gesture recognition system that is insensitive to environmental illumination and background variation. Standard vision sensors, such as CMOS cameras, are widely used as the sensing devices in state-of-the-art hand gesture recognition systems. However, such cameras are sensitive to environmental conditions, such as lighting variability and cluttered backgrounds, which significantly degrade their performance. In this work, we propose an event-based gesture recognition system to overcome these detrimental constraints and enhance the robustness of recognition. Our system relies on a biologically inspired neuromorphic vision sensor that has microsecond temporal resolution, high dynamic range, and low latency. The sensor output is a sequence of asynchronous events instead of discrete frames. To interpret the visual data, we use a wearable glove as an interaction device with five high-frequency (>100 Hz) active LED markers (ALMs), representing the fingers and palm, which are tracked precisely in the temporal domain using a restricted spatiotemporal particle filter algorithm. The latency of the sensing pipeline is negligible compared with the dynamics of the environment, as the sensor's temporal resolution allows us to distinguish high frequencies precisely. We design an encoding process to extract features and adopt a lightweight network to classify the hand gestures. The recognition accuracy of our system is comparable to that of state-of-the-art methods. To study the robustness of the system, we perform experiments under illumination and background variations, and the results show that our system is more robust than state-of-the-art deep learning-based gesture recognition systems.
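As a rough illustration of why an event stream makes blinking markers easy to tell apart, the following sketch estimates the blink frequency observed at a single pixel from the timestamps of its OFF-to-ON transitions. It is not the restricted spatiotemporal particle filter described above; it is a simplified per-pixel stand-in. The `Event` layout, the function names, and the 500 Hz test frequency are all assumptions made for this example rather than details taken from the system.

```python
from statistics import median
from typing import Iterable, List, NamedTuple, Optional


class Event(NamedTuple):
    """Hypothetical event layout; real sensors use their own formats."""
    t_us: int      # timestamp in microseconds
    x: int         # pixel column
    y: int         # pixel row
    polarity: int  # 1 = brightness increase (ON), 0 = decrease (OFF)


def blink_frequency_hz(events: Iterable[Event], x: int, y: int) -> Optional[float]:
    """Estimate the blink frequency seen at pixel (x, y).

    Returns None when fewer than two OFF-to-ON transitions are observed.
    """
    rising: List[int] = []  # timestamps at which an ON burst starts
    last_polarity: Optional[int] = None
    for ev in sorted(events, key=lambda e: e.t_us):
        if (ev.x, ev.y) != (x, y):
            continue
        # Count only the first ON event of each burst as a rising edge.
        if ev.polarity == 1 and last_polarity != 1:
            rising.append(ev.t_us)
        last_polarity = ev.polarity
    if len(rising) < 2:
        return None
    periods = [b - a for a, b in zip(rising, rising[1:])]
    # The median period is less sensitive to an occasional missed edge.
    return 1e6 / median(periods)


def synthetic_led_events(freq_hz: float, x: int, y: int, n_cycles: int) -> List[Event]:
    """Generate an idealized event stream for one blinking LED (assumed model)."""
    period_us = round(1e6 / freq_hz)
    events: List[Event] = []
    for k in range(n_cycles):
        t0 = k * period_us
        events.append(Event(t0, x, y, 1))       # LED switches on
        events.append(Event(t0 + 10, x, y, 1))  # second event of the same ON burst
        events.append(Event(t0 + period_us // 2, x, y, 0))  # LED switches off
    return events


if __name__ == "__main__":
    stream = synthetic_led_events(500.0, x=3, y=4, n_cycles=20)
    print(blink_frequency_hz(stream, 3, 4))
```

In a setting like the one described, each marker could be assigned a distinct blink frequency so that an estimate of this kind separates the markers from each other and from slower background motion. A practical tracker would additionally have to handle noise events, multiple pixels per marker, and marker motion.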
Note to Practitioners-This article addresses the robustness of hand gesture recognition systems, which is important for gesture recognition-based applications. Existing methods either rely on large volumes of data to train a deep learning model or restrict the application environment (e.g., to an ideal environment without a dynamic background). However, a vision-based deep learning model requires large computational resources, while the ideal environment limits the practicality of the system. In this work, we introduce a biologically inspired neuromorphic vision sensor and an ALM glove and build a novel gesture recognition system to tackle the above issues. The neuromorphic vision sensor has a microsecond temporal resolution and a high dynamic range. With these properties, the sensing system of our prototype operates with very low latency, which, in turn, makes our gesture recognition system robust to illumination variance and dynamic backgrounds. Thus, this work is valuable to research on illumination-robust gesture recognition systems. Preliminary experiments suggest that our system prototype is feasible, but it has been neither incorporated into an online gesture recognition system nor tested with complex gestures. In future work, we will concentrate on improving the signal processing methods to advance the current system toward complex and practical applications.