<p>Vertical elevator shafts offer little working space and contain complex components; occlusion is frequent, and different behaviors have similar spatial characteristics. As a result, unsafe behavior is difficult to detect, which seriously threatens the safety of maintenance personnel working in the elevator. This paper proposes FOA-BDNet, a behavior detection algorithm for elevator maintenance personnel based on a first-order deep network architecture. First, a lightweight backbone feature extraction network is designed to meet the online, real-time requirements of detection on surveillance video streams from the elevator maintenance environment. Then, a "far intersection and close connection" feature fusion network structure is proposed to fuse fine-grained information with coarse-grained information and to enhance the expressive power of deep semantic features. Finally, a first-order deep object detection algorithm adapted to the elevator scene is designed to identify and locate the behavior of maintenance personnel and to correctly detect unsafe behaviors. Experiments show that the method achieves a detection accuracy of 98.68% on the self-built data set, 4.41% higher than that of the recent object detection model YOLOv8-s, and an inference speed of 69.51 fps. The model can therefore be readily deployed on common edge devices and meets the real-time requirements for detecting unsafe behaviors of maintenance personnel in elevator scenes.</p>
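<p>To make the real-time claim concrete, the reported throughput can be converted into an average per-frame time budget and compared with a camera frame rate. The following is a minimal sketch, not part of the paper's method: the function names are hypothetical, and the 25 fps stream rate is an assumed value for a typical surveillance camera, not a figure from the paper.</p>

```python
def frame_budget_ms(fps: float) -> float:
    """Return the average processing time per frame, in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def keeps_up_with_stream(model_fps: float, stream_fps: float) -> bool:
    """True if the detector processes frames at least as fast as they arrive."""
    return model_fps >= stream_fps


REPORTED_FPS = 69.51        # inference speed reported in the abstract
ASSUMED_STREAM_FPS = 25.0   # hypothetical camera frame rate (assumption)

if __name__ == "__main__":
    print(f"Average time per frame: {frame_budget_ms(REPORTED_FPS):.2f} ms")
    print("Keeps up with stream:",
          keeps_up_with_stream(REPORTED_FPS, ASSUMED_STREAM_FPS))
```

<p>Under these assumptions, the detector spends roughly 14.4 ms on each frame, well within the 40 ms interval between frames of a 25 fps stream. Throughput on an actual edge device may differ from the figure measured in the experiments.</p>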