We are witnessing a rise in the use of ground and aerial robots in first response missions. These robots provide novel opportunities to support first responders and reduce the risk to human lives. As these robots become increasingly autonomous, researchers are seeking ways to enable natural communication strategies between robots and first responders, such as gestural interaction. First response work often takes place in harsh environments that pose unique challenges for gesture sensing and recognition, such as low visibility, making gestural interaction non-trivial. Sensors and algorithms must therefore be chosen carefully to support gesture recognition in harsh environments. In this work, we compare the performance of three common types of remote sensors, namely RGB, depth, and thermal cameras, paired with various recognition algorithms, in simulated harsh environments. Our results show recognition accuracies of 90% with smoke and 96% without, with the use of protective equipment. This work provides future researchers with clear data points to inform their choice of sensors and algorithms for gestural interaction with robots in harsh environments.