The analysis of teacher behavior in massive collections of teaching videos has recently attracted a surge of research interest. Traditional methods rely on manual analysis, which is extremely complex and time-consuming at this scale. Existing action recognition methods, however, are difficult to transfer to teacher behavior recognition, because it is hard to isolate the teacher’s behavior from a complex teaching scene and because teacher behaviors carry professional educational semantics. These methods therefore do not meet the needs of teacher behavior recognition. We propose a novel and simple method for recognizing teacher behavior in actual teaching scenes from massive teaching videos, which provides technical assistance for analyzing teacher behavior and fills the gap in automatic recognition of teacher behavior in actual teaching scenes. First, we identify an educational pattern that we name the “teacher set”, that is, the spatial region of a whole-class video in which the teacher is expected to appear. Based on this pattern, we develop a teacher set identification and extraction algorithm (Teacher-set IE algorithm) that locates the teacher in the teaching video and reduces interference from the classroom background. Then, we present an improved behavior recognition network based on 3D bilinear pooling (3D BP-TBR), which enhances the fused representation of three-dimensional features in order to classify teacher behavior. Experiments show that 3D BP-TBR achieves better performance on both public data and our self-built dataset (TAD-08). By deeply integrating educational characteristics with action recognition technology, the overall approach increases the recognition accuracy of teacher behavior in actual teaching scenes.
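The two ideas named above, restricting the input to a teacher region and fusing three-dimensional features by bilinear pooling, can be illustrated with a minimal NumPy sketch. It is not the Teacher-set IE algorithm or the 3D BP-TBR network: the function names `crop_teacher_set` and `bilinear_pool_3d`, the fixed `box`, and the random feature maps standing in for backbone outputs are all assumptions made for illustration. The pooling shown is the generic form (average of channel outer products over all spatiotemporal positions, followed by signed square root and L2 normalization).

```python
import numpy as np


def crop_teacher_set(clip, box):
    """Crop a clip of shape (T, H, W, C) to a fixed spatial region.

    `box` = (top, left, height, width) is a hypothetical stand-in for the
    teacher set; here it is supplied by hand rather than identified.
    """
    top, left, height, width = box
    return clip[:, top:top + height, left:left + width, :]


def bilinear_pool_3d(feat_a, feat_b, eps=1e-12):
    """Generic bilinear pooling over spatiotemporal feature maps.

    feat_a has shape (C1, T, H, W) and feat_b has shape (C2, T, H, W);
    the T, H, W dimensions must match. Returns a vector of length C1 * C2.
    """
    if feat_a.shape[1:] != feat_b.shape[1:]:
        raise ValueError("feature maps must share T, H, W dimensions")
    c1, c2 = feat_a.shape[0], feat_b.shape[0]
    a = feat_a.reshape(c1, -1)   # (C1, N) with N = T * H * W positions
    b = feat_b.reshape(c2, -1)   # (C2, N)
    n_positions = a.shape[1]
    # Average over positions of the outer product of the two channel vectors.
    pooled = (a @ b.T) / n_positions          # (C1, C2)
    vec = pooled.reshape(-1)
    vec = np.sign(vec) * np.sqrt(np.abs(vec))  # signed square root
    return vec / (np.linalg.norm(vec) + eps)   # L2 normalization


rng = np.random.default_rng(0)

# Hypothetical 8-frame RGB clip and a hand-picked teacher region.
clip = rng.random((8, 120, 160, 3))
teacher_clip = crop_teacher_set(clip, box=(10, 20, 64, 96))

# Random tensors standing in for two 3D feature maps from a backbone.
feat_a = rng.standard_normal((4, 8, 7, 7))
feat_b = rng.standard_normal((5, 8, 7, 7))
descriptor = bilinear_pool_3d(feat_a, feat_b)

print(teacher_clip.shape, descriptor.shape)  # (8, 64, 96, 3) (20,)
```

In a full pipeline the descriptor would feed a classifier over behavior categories; the sketch stops at the fused feature vector because the classifier design is not described here.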