Detection and recognition, two core tasks in WiFi-based action perception, require localizing motion regions within entire temporal sequences and classifying the corresponding action categories. Existing approaches, although achieving reasonably acceptable performance, suffer from two major drawbacks: heavy empirical dependency and high computational complexity. To address these issues, we develop LiteWiSys in this article, a lightweight end-to-end deep learning system that simultaneously detects and recognizes WiFi-based human actions. Specifically, we assign different attention weights to sub-carriers, which are then compressed to reduce noise and information redundancy. LiteWiSys then integrates depthwise separable convolution and a channel shuffle mechanism into a multi-scale convolutional backbone. By splitting the feature channels, two network branches are obtained and trained with a joint loss function for the dual tasks. We collect datasets in multiple scenes and conduct experiments to evaluate the performance of LiteWiSys. Compared with existing WiFi sensing systems, LiteWiSys achieves promising precision with lower complexity.
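To make the described building blocks concrete, the following is a minimal PyTorch sketch of a depthwise separable convolution block with channel shuffle, a channel split into two task branches (detection and recognition), and a joint loss. The layer sizes, kernel widths, head designs, and loss weighting are illustrative assumptions, not the authors' actual LiteWiSys configuration.

```python
import torch
import torch.nn as nn


def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    """Interleave channels across groups (ShuffleNet-style)."""
    n, c, t = x.shape
    x = x.view(n, groups, c // groups, t)
    x = x.transpose(1, 2).contiguous()
    return x.view(n, c, t)


class DepthwiseSeparableBlock(nn.Module):
    """Depthwise + pointwise 1-D convolution followed by channel shuffle."""

    def __init__(self, channels: int, kernel_size: int = 3, groups: int = 4):
        super().__init__()
        self.groups = groups
        self.depthwise = nn.Conv1d(channels, channels, kernel_size,
                                   padding=kernel_size // 2, groups=channels)
        self.pointwise = nn.Conv1d(channels, channels, kernel_size=1)
        self.bn = nn.BatchNorm1d(channels)
        self.act = nn.ReLU()

    def forward(self, x):
        x = self.act(self.bn(self.pointwise(self.depthwise(x))))
        return channel_shuffle(x, self.groups)


class DualBranchHead(nn.Module):
    """Split feature channels into a detection branch and a recognition branch."""

    def __init__(self, channels: int, num_classes: int):
        super().__init__()
        half = channels // 2
        # Hypothetical heads: per-frame actionness score and per-frame class logits.
        self.detect = nn.Conv1d(half, 1, kernel_size=1)
        self.recognize = nn.Conv1d(half, num_classes, kernel_size=1)

    def forward(self, x):
        det_feat, rec_feat = torch.chunk(x, 2, dim=1)
        return self.detect(det_feat), self.recognize(rec_feat)


def joint_loss(det_logits, rec_logits, det_target, rec_target, alpha=1.0):
    """Joint objective: detection (binary actionness) + recognition (cross-entropy)."""
    det = nn.functional.binary_cross_entropy_with_logits(det_logits, det_target)
    rec = nn.functional.cross_entropy(rec_logits, rec_target)
    return det + alpha * rec
```

In this sketch the two branches share the backbone features and are optimized together through the weighted sum in `joint_loss`, which mirrors the dual-task training described in the abstract; the weighting factor `alpha` is assumed for illustration.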